Abstract

Given an independent and identically distributed source $X = \{X_i\}_{i=1}^{\infty}$ with finite Shannon entropy or differential entropy (as the case may be) $H(X)$, the non-asymptotic equipartition property (NEP) with respect to $H(X)$ is established, which characterizes, for any finite block length $n$, how close $-\frac{1}{n}\ln p(X_1 X_2 \cdots X_n)$ is to $H(X)$ by determining the information spectrum of $X_1 X_2 \cdots X_n$, i.e., the distribution of $-\frac{1}{n}\ln p(X_1 X_2 \cdots X_n)$. Non-asymptotic equipartition properties of a similar nature (with respect to conditional entropy, mutual information, and relative entropy) are also established. These non-asymptotic equipartition properties are instrumental to the development of non-asymptotic coding (including both source and channel coding) results in information theory in the same way as the asymptotic equipartition property is to all asymptotic coding theorems established so far in information theory. As an example, the NEP with respect to $H(X)$ is used to establish a non-asymptotic fixed rate source coding theorem, which reveals, for any finite block length $n$, a complete picture about the tradeoff between the minimum rate of fixed rate coding of $X_1 \cdots X_n$ and error probability when the error probability is a constant, or goes to 0 with block length $n$ at a sub-polynomial $n^{-\alpha}$, $0 < \alpha < 1$, polynomial $n^{-\alpha}$, $\alpha \ge 1$, or sub-exponential $e^{-n^{\alpha}}$, $0 < \alpha < 1$, speed. In particular, it is shown that for any finite block length $n$, the minimum rate (in nats per symbol) of fixed rate coding of $X_1 X_2 \cdots X_n$ with error probability $\Theta\left(\frac{n^{-\alpha}}{\sqrt{\ln n}}\right)$ is $H(X) + \sqrt{\sigma_H^2(X)(2\alpha)}\sqrt{\frac{\ln n}{n}} + O\left(\frac{\ln n}{n}\right)$, where $\alpha > 0$ and $\sigma_H^2(X) = \mathbf{E}[-\ln p(X_1)]^2 - H^2(X)$ is the information variance of $X$. With the help of the NEP with respect to other information quantities, non-asymptotic channel coding theorems of similar nature will be established in a separate paper.

Index Terms 

Asymptotic equipartition property (AEP), conditional entropy, entropy, fixed rate coding, information spectrum, mutual information, non-asymptotic equipartition property (NEP).

I. Introduction

Consider an independent and identically distributed (IID) source $X = \{X_i\}_{i=1}^{\infty}$ with source alphabet $\mathcal{X}$ and finite entropy $H(X)$, where $H(X)$ is the Shannon entropy of $X_i$ if $\mathcal{X}$ is discrete, and the differential entropy of $X_i$ if $\mathcal{X}$ is the real line and each $X_i$ is a continuous random variable. Let $p(x)$ be the probability mass function (pmf) or probability density function (pdf) (as the case may be) of $X_i$. The asymptotic equipartition property (AEP) for $X$ is the assertion that

$$-\frac{1}{n}\ln p(X_1 X_2 \cdots X_n) \to H(X) \tag{1.1}$$

either in probability or with probability one as $n$ goes to $\infty$. It implies that for sufficiently large $n$, with high probability, the outcomes of $X_1 X_2 \cdots X_n$ are approximately equiprobable with their respective probability ranging from $e^{-n(H(X)+\epsilon)}$ to $e^{-n(H(X)-\epsilon)}$, where $\epsilon > 0$ is a small fixed number. Here and throughout the rest of the paper, $\ln$ stands for the logarithm with base $e$, and all information quantities are measured in nats.
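
The concentration asserted in (1.1) can be observed numerically. The following is a minimal sketch, not part of the original development: the three-symbol pmf `probs`, the seed, and the helper names are illustrative assumptions. It draws IID samples and reports how far $-\frac{1}{n}\ln p(X_1 \cdots X_n)$ lies from $H(X)$ for a few block lengths.

```python
import math
import random

def entropy_nats(probs):
    """Shannon entropy H(X) in nats of a pmf given as a list."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def information_rate(probs, n, rng):
    """Draw X_1..X_n IID from probs and return -(1/n) ln p(X_1 ... X_n)."""
    symbols = rng.choices(range(len(probs)), weights=probs, k=n)
    return -sum(math.log(probs[s]) for s in symbols) / n

rng = random.Random(0)            # fixed seed, chosen only for repeatability
probs = [0.5, 0.3, 0.2]           # hypothetical source pmf (an assumption)
H = entropy_nats(probs)
gaps = {n: abs(information_rate(probs, n, rng) - H) for n in (10, 1000, 100000)}
for n, gap in gaps.items():
    print(f"n={n:>6}  |rate - H| = {gap:.5f}")
```

The AEP only says that the gap vanishes as $n$ grows; how fast the tail probabilities of this gap decay at a fixed $n$ is exactly what the NEP quantifies.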

The AEP is fundamental to information theory. It is not only instrumental to lossless source coding theorems, but also behind almost all asymptotic coding (including source, channel, and multi-user coding) theorems through the concepts of typical sets and typical sequences [1].

However, in the non-asymptotic regime where one wants to establish non-asymptotic coding results for finite block length $n$, the AEP in its current form cannot be applied in general. In this paper, we aim to establish the non-asymptotic counterpart of the AEP, which is broadly referred to as the non-asymptotic equipartition property (NEP), so that the NEP can be applied to finite block length $n$. Specifically, with respect to $H(X)$, we first characterize, for any finite block length $n$, how close $-\frac{1}{n}\ln p(X_1 X_2 \cdots X_n)$ is to $H(X)$ by determining the information spectrum of $X_1 X_2 \cdots X_n$, i.e., the distribution of $-\frac{1}{n}\ln p(X_1 X_2 \cdots X_n)$; such a property is referred to as the NEP with respect to $H(X)$. For any IID source pair $(X, Y) = \{(X_i, Y_i)\}_{i=1}^{\infty}$ with finite conditional entropy $H(X|Y)$ and mutual information $I(X;Y)$, where $H(X|Y)$ is the Shannon conditional entropy of $X_i$ given $Y_i$ if $X$ is discrete, and the conditional differential entropy of $X_i$ given $Y_i$ if $X$ is continuous, we then examine, for any finite block length $n$, how close $-\frac{1}{n}\ln p(X^n|Y^n)$ ($\frac{1}{n}\ln\frac{p(Y^n|X^n)}{p(Y^n)}$, respectively) is to $H(X|Y)$ ($I(X;Y)$, respectively) by determining the distribution of $-\frac{1}{n}\ln p(X^n|Y^n)$ ($\frac{1}{n}\ln\frac{p(Y^n|X^n)}{p(Y^n)}$, respectively), where $p(x^n|y^n)$ ($p(y^n|x^n)$, respectively) is the conditional pmf or pdf (as the case may be) of $x^n = x_1 x_2 \cdots x_n$ ($y^n = y_1 y_2 \cdots y_n$, respectively) given $y^n$ ($x^n$, respectively); these properties are referred to as the NEP with respect to $H(X|Y)$ and $I(X;Y)$, respectively.

In the same way as the AEP plays an important role in establishing the asymptotic coding (including source, channel, and multi-user coding) results in information theory, our established NEP is also instrumental to the development of non-asymptotic source and channel coding results. Using the NEP with respect to $H(X)$, we further establish a non-asymptotic fixed rate source coding theorem, which reveals, for any finite block length $n$, a complete picture about the tradeoff between the minimum rate of fixed rate coding of $X_1 \cdots X_n$ and error probability when the error probability is a constant, or goes to 0 with block length $n$ at a sub-polynomial $n^{-\alpha}$, $0 < \alpha < 1$, polynomial $n^{-\alpha}$, $\alpha \ge 1$, or sub-exponential $e^{-n^{\alpha}}$, $0 < \alpha < 1$, speed. In particular, it is shown that for any finite block length $n$, the minimum rate (in nats per symbol) of fixed rate coding of $X_1 X_2 \cdots X_n$ with error probability $\Theta\left(\frac{n^{-\alpha}}{\sqrt{\ln n}}\right)$ is $H(X) + \sqrt{\sigma_H^2(X)(2\alpha)}\sqrt{\frac{\ln n}{n}} + O\left(\frac{\ln n}{n}\right)$, where $\alpha > 0$ and $\sigma_H^2(X) = \mathbf{E}[-\ln p(X_1)]^2 - H^2(X)$ is the information variance of $X$. In a separate paper [3], non-asymptotic channel coding theorems of similar nature will be established with the help of the NEP with respect to other information quantities; in particular, it is shown in [3] that for any binary input memoryless channel with uniform capacity achieving input $X$, random linear codes of block length $n$ can reach within $\sqrt{\sigma_H^2(X|Y)(2\alpha)}\sqrt{\frac{\ln n}{n}} + O\left(\frac{\ln n}{n}\right)$ of the channel capacity while maintaining word error probability $\Theta\left(\frac{n^{-\alpha}}{\sqrt{\ln n}}\right)$, where $\alpha > 0$ and $\sigma_H^2(X|Y) = \mathbf{E}[-\ln p(X|Y)]^2 - H^2(X|Y)$ is the conditional information variance of $X$ given $Y$ with $Y$ being the output of the channel in response to the input $X$.
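
To give a feel for the size of the second-order term, the sketch below evaluates only the leading redundancy term $\sqrt{2\alpha\,\sigma_H^2(X)\ln n / n}$ for a Bernoulli source. The source parameter `p = 0.1`, the choice $\alpha = 1$, and the function names are assumptions made for illustration; the closed form $\sigma_H^2(X) = p(1-p)\ln^2\frac{1-p}{p}$ is the variance of the two-valued random variable $-\ln p(X_1)$.

```python
import math

def info_variance_bernoulli(p):
    """sigma_H^2(X) for a Bernoulli(p) source: variance of -ln p(X_1)."""
    L = math.log((1 - p) / p)
    return p * (1 - p) * L * L

def rate_overhead(p, n, alpha):
    """Leading term sqrt(2*alpha*sigma_H^2*ln(n)/n) above H(X), in nats per symbol."""
    return math.sqrt(2.0 * alpha * info_variance_bernoulli(p) * math.log(n) / n)

# Hypothetical operating points: error probability decaying like n^(-alpha)/sqrt(ln n)
overheads = {n: rate_overhead(0.1, n, alpha=1.0) for n in (100, 1000, 10000)}
for n, extra in overheads.items():
    print(f"n={n:>5}  leading overhead = {extra:.4f} nats/symbol")
```

The lower-order remainder is deliberately left out, so these numbers indicate scale rather than exact minimum rates.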

The rest of the paper is organized as follows. Section II is devoted to the NEP with respect to $H(X)$. All results in Section II are then extended to the case of $H(X|Y)$ in Section III, thereby establishing the NEP with respect to $H(X|Y)$. In Section IV, we analyze the NEP with respect to the mutual information and relative entropy. Finally, in Section V, we apply the NEP with respect to $H(X)$ to investigate the performance of optimal fixed rate coding of $X_1 X_2 \cdots X_n$.

II. NEP With Respect to Entropy 

Define

$$\lambda^*(X) \triangleq \sup\left\{\lambda \ge 0 : \int p^{-\lambda+1}(x)\,dx < \infty\right\} \tag{2.1}$$

where $\int dx$ is understood throughout this paper to be the summation over the source alphabet of $X$ if $X$ is discrete. Suppose that

$$\lambda^*(X) > 0 . \tag{2.2}$$

Let

$$\sigma_H^2(X) \triangleq \int p(x)[-\ln p(x)]^2\,dx - H^2(X) \tag{2.3}$$

which will be referred to as the information variance of $X$. It is not hard to see that under the assumption (2.2),

$$\int \frac{p^{-\lambda+1}(x)}{\int p^{-\lambda+1}(y)\,dy}\,|\ln p(x)|^k\,dx < \infty \tag{2.4}$$

and

$$\int p^{-\lambda+1}(x)\,dx < \infty$$

for any $\lambda \in (0, \lambda^*(X))$ and any positive integer $k$. Further assume that

$$\sigma_H^2(X) > 0 \quad\text{and}\quad \int p(x)\,|\ln p(x)|^3\,dx < \infty . \tag{2.5}$$

Then we have the following result, which will be referred to as the weak right NEP with respect to $H(X)$.
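Theorem 1 (Weak Right NEP). For any $\delta \ge 0$, let

$$r_X(\delta) \triangleq \sup_{\lambda \ge 0}\left[\lambda(H(X)+\delta) - \ln\int p^{-\lambda+1}(x)\,dx\right].$$

Then the following hold:

(a) For any positive integer $n$,

$$\Pr\left\{-\frac{1}{n}\ln p(X^n) > H(X)+\delta\right\} \le e^{-n r_X(\delta)} \tag{2.6}$$

where $X^n = X_1 X_2 \cdots X_n$.

(b) Under the assumptions (2.2) and (2.5), there exists a $\delta^* > 0$ such that for any $\delta \in (0, \delta^*]$ and any positive integer $n$,

$$r_X(\delta) = \frac{1}{2\sigma_H^2(X)}\delta^2 + O(\delta^3) \tag{2.7}$$

and hence

$$\Pr\left\{-\frac{1}{n}\ln p(X^n) > H(X)+\delta\right\} \le e^{-n\left(\frac{\delta^2}{2\sigma_H^2(X)} + O(\delta^3)\right)} . \tag{2.8}$$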
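Proof of Theorem 1: The inequality (2.6) follows from the Chernoff bound. To see this is indeed the case, note that

$$\begin{aligned}
\Pr\left\{-\frac{1}{n}\ln p(X_1 X_2 \cdots X_n) > H(X)+\delta\right\}
&= \Pr\left\{-\ln p(X_1 X_2 \cdots X_n) > n(H(X)+\delta)\right\} \\
&\le \inf_{\lambda \ge 0}\frac{\mathbf{E}\left[e^{-\lambda\ln p(X_1 X_2 \cdots X_n)}\right]}{e^{n\lambda(H(X)+\delta)}} \\
&= \inf_{\lambda \ge 0} e^{-n\left[\lambda(H(X)+\delta) - \ln\mathbf{E}[p^{-\lambda}(X_1)]\right]} \\
&= \inf_{\lambda \ge 0} e^{-n\left[\lambda(H(X)+\delta) - \ln\int p^{-\lambda+1}(x)\,dx\right]} \\
&= e^{-n r_X(\delta)} .
\end{aligned} \tag{2.9}$$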



To show (2.7) and (2.8), we first analyze the property of $r_X(\delta)$ as a function of $\delta$ over the region $\delta \ge 0$. It is easy to see that $r_X(\delta)$ is convex and non-decreasing. For any $\lambda \in [0, \lambda^*(X))$, define

$$\delta(\lambda) \triangleq \int \frac{p^{-\lambda+1}(x)}{\int p^{-\lambda+1}(y)\,dy}\,[-\ln p(x)]\,dx - H(X) \tag{2.10}$$

which, in view of (2.4), is well defined. Using a similar argument as in [4, Properties 1 to 3], it is not hard to show that under the assumption (2.2), $\delta(\lambda)$ as a function of $\lambda$ is continuously differentiable up to any order over $\lambda \in (0, \lambda^*(X))$. Taking the first order derivative of $\delta(\lambda)$ yields

$$\delta'(\lambda) = \int \frac{p^{-\lambda+1}(x)}{\int p^{-\lambda+1}(y)\,dy}\,[-\ln p(x)]^2\,dx - \left[\int \frac{p^{-\lambda+1}(x)}{\int p^{-\lambda+1}(y)\,dy}\,[-\ln p(x)]\,dx\right]^2 > 0 \tag{2.11}$$

where the last inequality is due to (2.5). It is also easy to see that $\delta(0) = 0$ and $\delta'(0) = \sigma_H^2(X)$. Therefore, $\delta(\lambda)$ is strictly increasing over $\lambda \in [0, \lambda^*(X))$. On the other hand, it is not hard to verify that under the assumption (2.2), the function $\lambda(H(X)+\delta) - \ln\int p^{-\lambda+1}(x)\,dx$ as a function of $\lambda$ is continuously differentiable over $\lambda \in [0, \lambda^*(X))$ with its derivative equal to

$$\delta - \delta(\lambda) . \tag{2.12}$$
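
The properties of $\delta(\lambda)$ used here ($\delta(0) = 0$, $\delta'(0) = \sigma_H^2(X)$, and $\delta'(\lambda)$ equal to the variance of $-\ln p$ under the tilted distribution) can be checked numerically for a finite alphabet. The following is a sketch under assumptions: the pmf `probs` is hypothetical, sums replace integrals, and derivatives are approximated by finite differences.

```python
import math

def tilted_stats(probs, lam):
    """Mean and variance of -ln p(X) under the tilted pmf p^(1-lam)(x) / sum_y p^(1-lam)(y)."""
    w = [p ** (1.0 - lam) for p in probs]
    z = sum(w)
    mean = sum(wi / z * -math.log(p) for wi, p in zip(w, probs))
    second = sum(wi / z * math.log(p) ** 2 for wi, p in zip(w, probs))
    return mean, second - mean ** 2

def delta(probs, lam):
    """delta(lam) as in (2.10), with the integral read as a sum."""
    H = -sum(p * math.log(p) for p in probs)
    return tilted_stats(probs, lam)[0] - H

probs = [0.5, 0.3, 0.2]                     # hypothetical pmf (an assumption)
H, sigma2 = tilted_stats(probs, 0.0)        # at lam = 0: H(X) and sigma_H^2(X)
deltas = [delta(probs, 0.1 * k) for k in range(31)]

h = 1e-6
slope_at_0 = (delta(probs, h) - delta(probs, 0.0)) / h            # forward difference
h = 1e-5
slope_at_1 = (delta(probs, 1 + h) - delta(probs, 1 - h)) / (2 * h)  # central difference
var_at_1 = tilted_stats(probs, 1.0)[1]
print(sigma2, slope_at_0, slope_at_1, var_at_1)
```

For a finite alphabet $\lambda^*(X) = \infty$, so any grid of $\lambda$ values is admissible; for a continuous source the grid would have to stay below $\lambda^*(X)$.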



To continue, we distinguish between two cases: (1) $\lambda^*(X) = \infty$, and (2) $\lambda^*(X) < \infty$. In case (1), since $\delta(\lambda)$ is strictly increasing over $\lambda \in [0, \infty)$, it follows that for any $\delta = \delta(\lambda)$ for some $\lambda \in [0, \lambda^*(X))$, the supremum in the definition of $r_X(\delta)$ is actually achieved at that particular $\lambda$, i.e.,

$$r_X(\delta(\lambda)) = \lambda(H(X)+\delta(\lambda)) - \ln\int p^{-\lambda+1}(x)\,dx . \tag{2.13}$$
In case (2), we have that for any $\delta = \delta(\lambda)$ for some $\lambda \in [0, \lambda^*(X))$,

$$\beta(H(X)+\delta(\lambda)) - \ln\int p^{-\beta+1}(x)\,dx < \lambda(H(X)+\delta(\lambda)) - \ln\int p^{-\lambda+1}(x)\,dx \tag{2.14}$$

for any $\beta \in [0, \lambda^*(X))$ with $\beta \ne \lambda$. In view of the definition of $\lambda^*(X)$, (2.14) remains valid for any $\beta > \lambda^*(X)$ since then the left side of (2.14) is $-\infty$. What remains to check is when $\beta = \lambda^*(X)$. If

$$\int p^{-\lambda^*(X)+1}(x)\,dx = \infty$$

it is easy to see that (2.14) holds as well when $\beta = \lambda^*(X)$. Suppose now

$$\int p^{-\lambda^*(X)+1}(x)\,dx < \infty .$$

In this case, it follows from the dominated convergence theorem that

$$\lim_{\beta \uparrow \lambda^*(X)} \int p^{-\beta+1}(x)\,dx = \int p^{-\lambda^*(X)+1}(x)\,dx$$

and hence by letting $\beta$ go to $\lambda^*(X)$ from the left, we see that (2.14) holds as well when $\beta = \lambda^*(X)$. Putting all cases together, we always have that for any $\delta = \delta(\lambda)$ for some $\lambda \in [0, \lambda^*(X))$,

$$r_X(\delta(\lambda)) = \lambda(H(X)+\delta(\lambda)) - \ln\int p^{-\lambda+1}(x)\,dx . \tag{2.15}$$



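Let

$$\Delta^*(X) \triangleq \lim_{\lambda \uparrow \lambda^*(X)} \delta(\lambda) .$$

Since both $\delta(\lambda)$ and $\ln\int p^{-\lambda+1}(x)\,dx$ are continuously differentiable with respect to $\lambda \in (0, \lambda^*(X))$ up to any order, it follows from (2.15) that $r_X(\delta)$ is also continuously differentiable with respect to $\delta \in (0, \Delta^*(X))$ up to any order. (At $\delta = 0$, $r_X(\delta)$ is continuously differentiable up to at least the third order inclusive.) Taking the first and second order derivatives of $r_X(\delta)$ with respect to $\delta$, we have

$$\begin{aligned}
r'_X(\delta) &\triangleq \frac{d r_X(\delta)}{d\delta} \\
&= \frac{d r_X(\delta(\lambda))}{d\lambda}\,\frac{d\lambda}{d\delta} \\
&= \frac{d r_X(\delta(\lambda))}{d\lambda}\,\frac{1}{\delta'(\lambda)} \\
&= \frac{1}{\delta'(\lambda)}\left[H(X) + \delta(\lambda) + \lambda\delta'(\lambda) - \int \frac{p^{-\lambda+1}(x)}{\int p^{-\lambda+1}(y)\,dy}\,[-\ln p(x)]\,dx\right] \\
&= \lambda
\end{aligned} \tag{2.16}$$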
and

$$r''_X(\delta) = \frac{d\lambda}{d\delta} = \frac{1}{\delta'(\lambda)} > 0 \tag{2.17}$$

where $\delta = \delta(\lambda)$. Therefore, $r_X(\delta)$ is convex, strictly increasing, and continuously differentiable up to at least the third order (inclusive) over $\delta \in [0, \Delta^*(X))$. Note that from (2.16) and (2.17), we have $r'_X(0) = 0$ and $r''_X(0) = 1/\sigma_H^2(X)$. Expanding $r_X(\delta)$ at $\delta = 0$ by the Taylor expansion, we then have that there exists a $\delta^* > 0$ such that

$$r_X(\delta) = \frac{1}{2\sigma_H^2(X)}\delta^2 + O(\delta^3) \tag{2.18}$$

for $\delta \in (0, \delta^*]$. The inequality (2.8) now follows immediately from (2.6) and (2.18). This completes the proof of Theorem 1. ■
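
As a numerical illustration of Theorem 1, the sketch below approximates $r_X(\delta)$ by a grid search over $\lambda$, compares it with the quadratic term $\delta^2/(2\sigma_H^2(X))$, and checks the bound (2.6) against the exact tail probability of a Bernoulli source. The source parameter, block length, grid range, and function names are illustrative assumptions; because the grid search can only underestimate the supremum, the resulting bound is slightly looser than (2.6), never tighter.

```python
import math

def r_X(probs, delta, lam_max=20.0, steps=20000):
    """Grid approximation of sup_{lam>=0} [lam*(H+delta) - ln sum_x p(x)^(1-lam)]."""
    H = -sum(p * math.log(p) for p in probs)
    best = 0.0                                   # the value at lam = 0
    for k in range(1, steps + 1):
        lam = lam_max * k / steps
        val = lam * (H + delta) - math.log(sum(p ** (1.0 - lam) for p in probs))
        best = max(best, val)
    return best

def exact_right_tail(p, n, delta):
    """Pr{-(1/n) ln p(X^n) > H + delta} for a Bernoulli(p) source (sum over the number of ones)."""
    H = -p * math.log(p) - (1 - p) * math.log(1 - p)
    total = 0.0
    for k in range(n + 1):
        rate = (-k * math.log(p) - (n - k) * math.log(1 - p)) / n
        if rate > H + delta:
            log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                       + k * math.log(p) + (n - k) * math.log(1 - p))
            total += math.exp(log_pmf)
    return total

p, n = 0.1, 200                                  # hypothetical source and block length
sigma2 = p * (1 - p) * math.log((1 - p) / p) ** 2
r_small = r_X([p, 1 - p], 0.02)
quad = 0.02 ** 2 / (2 * sigma2)                  # leading term of (2.7)
tail = exact_right_tail(p, n, 0.1)
bound = math.exp(-n * r_X([p, 1 - p], 0.1))      # right side of (2.6)
print(r_small, quad, tail, bound)
```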
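Having analyzed the function $r_X(\delta)$, we are now ready for a stronger version of the right NEP. For any $\lambda \in [0, \lambda^*(X))$, define

$$f_\lambda(x) \triangleq \frac{p^{-\lambda}(x)}{\int p^{-\lambda+1}(y)\,dy} \tag{2.19}$$

$$\sigma_H^2(X, \lambda) \triangleq \int f_\lambda(x)p(x)\,\bigl|-\ln p(x) - (H(X)+\delta(\lambda))\bigr|^2\,dx \tag{2.20}$$

$$M_H(X, \lambda) \triangleq \int f_\lambda(x)p(x)\,\bigl|-\ln p(x) - (H(X)+\delta(\lambda))\bigr|^3\,dx \tag{2.21}$$

and

$$f_\lambda(x^n) \triangleq \prod_{i=1}^{n} f_\lambda(x_i) \tag{2.22}$$

where $\delta(\lambda)$ is defined in (2.10). Write $M_H(X, 0)$ as $M_H(X)$. It is easy to see that $\sigma_H^2(X, 0) = \sigma_H^2(X)$, $\sigma_H^2(X, \lambda) = \delta'(\lambda)$, and

$$M_H(X) = \int p(x)\,\bigl|-\ln p(x) - H(X)\bigr|^3\,dx . \tag{2.23}$$

Then we have the following stronger result.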



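Theorem 2 (Strong Right NEP). Under the assumptions (2.2) and (2.5), the following hold:

(a) For any $\delta \in (0, \Delta^*(X))$ and any positive integer $n$,

$$\bar{\xi}_H(X, \lambda, n)\,e^{-n r_X(\delta)} \ge \Pr\left\{-\frac{1}{n}\ln p(X^n) > H(X)+\delta\right\} \ge \underline{\xi}_H(X, \lambda, n)\,e^{-n r_X(\delta)} \tag{2.24}$$

where $\lambda = r'_X(\delta) > 0$,

$$\bar{\xi}_H(X, \lambda, n) = \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} + e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\left[Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) - Q\bigl(\rho^* + \sqrt{n}\lambda\sigma_H(X,\lambda)\bigr)\right] \tag{2.25}$$

$$\underline{\xi}_H(X, \lambda, n) = e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\rho_* + \sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) \tag{2.26}$$

with $Q(\rho^*) = \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)}$ and $Q(\rho_*) = \frac{1}{2} - \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)}$, $Q(t) = \frac{1}{\sqrt{2\pi}}\int_t^{\infty} e^{-u^2/2}\,du$, and $C < 1$ is the universal constant in the central limit theorem of Berry and Esseen.

(b) For any $\delta \le c\sqrt{\frac{\ln n}{n}}$, where $c < \sigma_H(X)$ is a constant,

$$Q\left(\frac{\delta\sqrt{n}}{\sigma_H(X)}\right) - \frac{CM_H(X)}{\sqrt{n}\,\sigma_H^3(X)} \le \Pr\left\{-\frac{1}{n}\ln p(X^n) > H(X)+\delta\right\} \le Q\left(\frac{\delta\sqrt{n}}{\sigma_H(X)}\right) + \frac{CM_H(X)}{\sqrt{n}\,\sigma_H^3(X)} . \tag{2.27}$$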
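Proof of Theorem 2: From (2.15), it follows that with $\lambda = r'_X(\delta)$,

$$r_X(\delta) = \lambda(H(X)+\delta) - \ln\int p^{-\lambda+1}(x)\,dx . \tag{2.28}$$

Then it is not hard to verify that

$$\begin{aligned}
\Pr\left\{-\frac{1}{n}\ln p(X^n) > H(X)+\delta\right\}
&= \int_{-\frac{1}{n}\ln p(x^n) > H(X)+\delta} p(x^n)\,dx^n \\
&= \int_{-\frac{1}{n}\ln p(x^n) > H(X)+\delta} e^{-n\left[-\frac{\lambda}{n}\ln p(x^n) - \lambda(H(X)+\delta) + r_X(\delta)\right]} f_\lambda(x^n)p(x^n)\,dx^n \\
&= e^{-n r_X(\delta)}\int_{-\frac{1}{n}\ln p(x^n) > H(X)+\delta} e^{-n\lambda\left[-\frac{1}{n}\ln p(x^n) - (H(X)+\delta)\right]} f_\lambda(x^n)p(x^n)\,dx^n \\
&= e^{-n r_X(\delta)}\int_{\rho > 0,\ \rho = \frac{-\ln p(x^n) - n(H(X)+\delta)}{\sqrt{n}\,\sigma_H(X,\lambda)}} e^{-\sqrt{n}\lambda\sigma_H(X,\lambda)\rho} f_\lambda(x^n)p(x^n)\,dx^n \\
&= e^{-n r_X(\delta)}\int_{0}^{+\infty} e^{-\sqrt{n}\lambda\sigma_H(X,\lambda)\rho}\,d\bigl(1 - \bar{F}_n(\rho)\bigr) \\
&= e^{-n r_X(\delta)}\left[\bar{F}_n(0) - \int_{0}^{+\infty}\sqrt{n}\lambda\sigma_H(X,\lambda)\,e^{-\sqrt{n}\lambda\sigma_H(X,\lambda)\rho}\,\bar{F}_n(\rho)\,d\rho\right]
\end{aligned} \tag{2.29}$$

where the last equality is due to integration by parts,

$$\bar{F}_n(\rho) \triangleq \Pr\left\{\frac{\sum_{i=1}^{n}[-\ln p(Z_i)] - n(H(X)+\delta)}{\sqrt{n}\,\sigma_H(X,\lambda)} > \rho\right\}$$

and $\{Z_i\}_{i=1}^{n}$ are IID random variables with pmf or pdf (as the case may be) $f_\lambda(x)p(x)$. Let

$$\xi_n \triangleq \bar{F}_n(0) - \int_{0}^{+\infty}\sqrt{n}\lambda\sigma_H(X,\lambda)\,e^{-\sqrt{n}\lambda\sigma_H(X,\lambda)\rho}\,\bar{F}_n(\rho)\,d\rho \tag{2.30}$$

$$\phantom{\xi_n} = \int_{0}^{+\infty}\sqrt{n}\lambda\sigma_H(X,\lambda)\,e^{-\sqrt{n}\lambda\sigma_H(X,\lambda)\rho}\,\bigl[\bar{F}_n(0) - \bar{F}_n(\rho)\bigr]\,d\rho . \tag{2.31}$$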


At this point, we invoke the following central limit theorem of Berry and Esseen Theorem 
1.2]. 

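Lemma 1. Let $V_1, V_2, \cdots$ be independent real random variables with zero means and finite third moments, and set

$$\sigma_n^2 = \sum_{i=1}^{n}\mathbf{E}V_i^2 .$$

Then there exists a universal constant $C < 1$ such that for any $n \ge 1$,

$$\sup_{-\infty < t < +\infty}\left|\Pr\left\{\sum_{i=1}^{n} V_i > \sigma_n t\right\} - Q(t)\right| \le C\sigma_n^{-3}\sum_{i=1}^{n}\mathbf{E}|V_i|^3 .$$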

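Towards evaluating $\xi_n$, we can bound $\bar{F}_n(\rho)$ in terms of $Q(\rho)$, by applying Lemma 1 to $\{-\ln p(Z_i) - (H(X)+\delta)\}_{i=1}^{n}$. Then for $\rho \ge 0$, we have

$$\bar{F}_n(0) \le Q(0) + \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} = \frac{1}{2} + \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} \tag{2.32}$$

$$\bar{F}_n(\rho) \ge \left[Q(\rho) - \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)}\right]^+ \tag{2.33}$$

and

$$\bar{F}_n(0) - \bar{F}_n(\rho) \ge \left[Q(0) - \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} - Q(\rho) - \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)}\right]^+ = \left[\frac{1}{2} - \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} - Q(\rho)\right]^+ \tag{2.34}$$

where $[x]^+ = \max\{x, 0\}$. Now plugging (2.32) and (2.33) into (2.30) yields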



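$$\begin{aligned}
\xi_n &\le \frac{1}{2} + \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} - \int_{0}^{+\infty}\sqrt{n}\lambda\sigma_H(X,\lambda)\,e^{-\sqrt{n}\lambda\sigma_H(X,\lambda)\rho}\left[Q(\rho) - \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)}\right]^+ d\rho \\
&= \frac{1}{2} + \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} - \int_{0}^{\rho^*}\sqrt{n}\lambda\sigma_H(X,\lambda)\,e^{-\sqrt{n}\lambda\sigma_H(X,\lambda)\rho}\left[Q(\rho) - \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)}\right] d\rho \\
&= \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} + \int_{0}^{\rho^*}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{\rho^2}{2} - \sqrt{n}\lambda\sigma_H(X,\lambda)\rho}\,d\rho \\
&= \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} + e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\int_{0}^{\rho^*}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{(\rho + \sqrt{n}\lambda\sigma_H(X,\lambda))^2}{2}}\,d\rho \\
&= \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} + e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\left[Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) - Q\bigl(\rho^* + \sqrt{n}\lambda\sigma_H(X,\lambda)\bigr)\right] \\
&= \bar{\xi}_H(X, \lambda, n)
\end{aligned} \tag{2.35}$$

where $Q(\rho^*) = \frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)}$, and meanwhile plugging (2.34) into (2.31) yields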



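$$\begin{aligned}
\xi_n &\ge \int_{0}^{+\infty}\sqrt{n}\lambda\sigma_H(X,\lambda)\,e^{-\sqrt{n}\lambda\sigma_H(X,\lambda)\rho}\left[\frac{1}{2} - \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} - Q(\rho)\right]^+ d\rho \\
&= \int_{\rho_*}^{+\infty}\sqrt{n}\lambda\sigma_H(X,\lambda)\,e^{-\sqrt{n}\lambda\sigma_H(X,\lambda)\rho}\left[\frac{1}{2} - \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} - Q(\rho)\right] d\rho \\
&= \int_{\rho_*}^{+\infty}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{\rho^2}{2}}\,e^{-\sqrt{n}\lambda\sigma_H(X,\lambda)\rho}\,d\rho \\
&= e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\rho_* + \sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) \\
&= \underline{\xi}_H(X, \lambda, n)
\end{aligned} \tag{2.36}$$

where $Q(\rho_*) = \frac{1}{2} - \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)}$. Combining (2.29) with (2.35) and (2.36) completes the proof of part (a) of Theorem 2.

Applying Lemma 1 to the IID sequence $\{-\ln p(X_i) - H(X)\}_{i=1}^{n}$, we get (2.27). This completes the proof of Theorem 2. ■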
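
The two-sided bound (2.24) can be evaluated in closed form for a Bernoulli source and compared with the exact tail probability. The sketch below does this under several assumptions that are ours, not the paper's: the Berry-Esseen constant is taken as `C_BE = 0.56` (a published admissible value; the theorem only requires some valid $C < 1$), the source is Bernoulli($p$) with $p < 1/2$, and the threshold is parameterized by the tilted probability `q`, so that $\delta = (q - p)\ln\frac{1-p}{p}$ and $r_X(\delta) = D(q\|p)$.

```python
import math
from statistics import NormalDist

C_BE = 0.56  # assumption: an admissible Berry-Esseen constant below 1

def Q(t):
    """Gaussian tail probability Q(t)."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def Q_inv(y):
    """Inverse of Q on (0, 1)."""
    return NormalDist().inv_cdf(1.0 - y)

def strong_right_nep_bernoulli(p, n, q):
    """Lower and upper bounds of (2.24) for Pr{K/n > q}, K the number of ones, p < q < 1/2."""
    L = math.log((1 - p) / p)
    lam = 1.0 - math.log(q / (1 - q)) / math.log(p / (1 - p))  # tilt giving Pr{X=1} = q
    sigma = math.sqrt(q * (1 - q)) * L                         # sigma_H(X, lam)
    M = q * (1 - q) * ((1 - q) ** 2 + q ** 2) * L ** 3         # M_H(X, lam)
    r = q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))  # r_X(delta) = D(q||p)
    c = C_BE * M / (math.sqrt(n) * sigma ** 3)                 # needs c < 1/4
    a = math.sqrt(n) * lam * sigma
    rho_up, rho_lo = Q_inv(c), Q_inv(0.5 - 2 * c)
    xi_up = 2 * c + math.exp(a * a / 2) * (Q(a) - Q(rho_up + a))
    xi_lo = math.exp(a * a / 2) * Q(rho_lo + a)
    return xi_lo * math.exp(-n * r), xi_up * math.exp(-n * r)

def exact_tail(p, n, q):
    """Pr{K/n > q} for K ~ Binomial(n, p), summed in the log domain."""
    total = 0.0
    for k in range(n + 1):
        if k > n * q:
            total += math.exp(math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                              + k * math.log(p) + (n - k) * math.log(1 - p))
    return total

p, n, q = 0.1, 400, 0.15125          # hypothetical parameters; n*q is not an integer
lower, upper = strong_right_nep_bernoulli(p, n, q)
exact = exact_tail(p, n, q)
print(lower, exact, upper)
```

The block length must be large enough that $\frac{CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} < \frac{1}{4}$, otherwise $\rho_*$ is undefined and `Q_inv` raises an error.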

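Remark 1. Note that $\lambda = r'_X(\delta) = O(\delta)$. When $\lambda = \Omega(1)$ with respect to $n$, it can be easily verified that $\bar{\xi}_H(X, \lambda, n)$ and $\underline{\xi}_H(X, \lambda, n)$ are both on the order of $\frac{1}{\sqrt{n}}$ by applying the well-known inequality

$$\frac{1}{\sqrt{2\pi}}\,\frac{t}{1+t^2}\,e^{-\frac{t^2}{2}} \le Q(t) \le \frac{1}{\sqrt{2\pi}\,t}\,e^{-\frac{t^2}{2}}, \quad t > 0 .$$

Meanwhile, on one hand, it is easy to see that

$$\bar{\xi}_H(X, \lambda, n) \le e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) + \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} .$$

On the other hand,

$$\begin{aligned}
\underline{\xi}_H(X, \lambda, n) &= e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) - e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\int_{\sqrt{n}\lambda\sigma_H(X,\lambda)}^{\rho_* + \sqrt{n}\lambda\sigma_H(X,\lambda)}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{\rho^2}{2}}\,d\rho \\
&= e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) - e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\int_{0}^{\rho_*}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{(\rho + \sqrt{n}\lambda\sigma_H(X,\lambda))^2}{2}}\,d\rho \\
&= e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) - \int_{0}^{\rho_*}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{\rho^2 + 2\rho\sqrt{n}\lambda\sigma_H(X,\lambda)}{2}}\,d\rho \\
&\ge e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) - \int_{0}^{\rho_*}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{\rho^2}{2}}\,d\rho \\
&= e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) - \frac{2CM_H(X,\lambda)}{\sqrt{n}\,\sigma_H^3(X,\lambda)} .
\end{aligned}$$

To further shed light on $\bar{\xi}_H(X, \lambda, n)$ and $\underline{\xi}_H(X, \lambda, n)$, we observe that

$$\frac{1}{\sqrt{2\pi}\sqrt{n}\lambda\sigma_H(X,\lambda) + \frac{\sqrt{2\pi}}{\sqrt{n}\lambda\sigma_H(X,\lambda)}} \le e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) \le \frac{1}{\sqrt{2\pi}\sqrt{n}\lambda\sigma_H(X,\lambda)} .$$

And therefore, whenever $\lambda = o(1)$ and $\lambda = \omega(n^{-1/2})$,

$$e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr) = \Theta\left(\frac{1}{\sqrt{n}\lambda}\right) = \omega\left(\frac{1}{\sqrt{n}}\right)$$

which further implies

$$\bar{\xi}_H(X, \lambda, n) = e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr)(1 + o(1))$$

$$\underline{\xi}_H(X, \lambda, n) = e^{\frac{n\lambda^2\sigma_H^2(X,\lambda)}{2}}\,Q\bigl(\sqrt{n}\lambda\sigma_H(X,\lambda)\bigr)(1 - o(1)) .$$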
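
Remark 1 relies on the standard sandwich $\frac{1}{\sqrt{2\pi}}\frac{t}{1+t^2}e^{-t^2/2} \le Q(t) \le \frac{1}{\sqrt{2\pi}\,t}e^{-t^2/2}$ for $t > 0$. A quick numerical check of that sandwich is sketched below; the sample points and function names are arbitrary choices made for illustration.

```python
import math

def Q(t):
    """Gaussian tail probability Q(t) = Pr{N(0,1) > t}."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def q_sandwich(t):
    """Return (lower bound, Q(t), upper bound) for t > 0."""
    phi = math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)   # standard normal pdf
    return t / (1.0 + t * t) * phi, Q(t), phi / t

checks = [q_sandwich(t) for t in (0.1, 0.5, 1.0, 2.0, 4.0, 8.0)]
for lo, mid, hi in checks:
    print(f"{lo:.3e} <= {mid:.3e} <= {hi:.3e}")
```

Multiplying all three columns by $e^{t^2/2}$ with $t = \sqrt{n}\lambda\sigma_H(X,\lambda)$ gives the bounds on $e^{n\lambda^2\sigma_H^2(X,\lambda)/2}\,Q(\sqrt{n}\lambda\sigma_H(X,\lambda))$ used in the remark.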



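Remark 2. Another interesting observation from the proof of Theorem 2, especially (2.29), is the recursive relation between

$$\Pr\left\{-\frac{1}{n}\ln p(X^n) > H(X)+\delta\right\} = \Pr\left\{\frac{-\ln p(X^n) - nH(X)}{\sqrt{n}\,\sigma_H(X)} > \frac{\sqrt{n}\,\delta}{\sigma_H(X)}\right\} \triangleq \bar{F}_{X,n}\left(\frac{\sqrt{n}\,\delta}{\sigma_H(X)}\right)$$

and

$$\bar{F}_{Z,n}(\rho) \triangleq \bar{F}_n(\rho) = \Pr\left\{\frac{-\ln p(Z^n) - n(H(X)+\delta)}{\sqrt{n}\,\sigma_H(X,\lambda)} > \rho\right\} .$$

As shown in the proof, a proper bound on $\bar{F}_{Z,n}(\rho)$ (using the Berry-Esseen central limit theorem) results in a bound (2.24) on $\bar{F}_{X,n}\left(\frac{\sqrt{n}\,\delta}{\sigma_H(X)}\right)$. To continue, we can apply this bound (2.24) on $\bar{F}_{Z,n}(\rho)$ to get another bound on $\bar{F}_{X,n}\left(\frac{\sqrt{n}\,\delta}{\sigma_H(X)}\right)$. Numerically, we can keep tightening the bound on $\bar{F}_{X,n}\left(\frac{\sqrt{n}\,\delta}{\sigma_H(X)}\right)$ in this recursive manner until no significant improvement can be made.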

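The probability that $-\frac{1}{n}\ln p(X^n)$ is away from $H(X)$ to the left can be bounded similarly. Define

$$\lambda^*_-(X) \triangleq \sup\left\{\lambda \ge 0 : \int p^{\lambda+1}(x)\,dx < \infty\right\} . \tag{2.37}$$

Suppose that

$$\lambda^*_-(X) > 0 . \tag{2.38}$$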
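Define for any $\delta \ge 0$

$$r_{X,-}(\delta) \triangleq \sup_{\lambda \ge 0}\left[\lambda(\delta - H(X)) - \ln\int p^{\lambda+1}(x)\,dx\right]$$

and for any $\lambda \in [0, \lambda^*_-(X))$

$$\delta_-(\lambda) \triangleq \int \frac{p^{\lambda+1}(x)}{\int p^{\lambda+1}(y)\,dy}\,[\ln p(x)]\,dx + H(X) .$$

Then under the assumption (2.5), $\delta_-(\lambda)$ is strictly increasing over $\lambda \in [0, \lambda^*_-(X))$ with $\delta_-(0) = 0$. Let

$$\Delta^*_-(X) \triangleq \lim_{\lambda \uparrow \lambda^*_-(X)} \delta_-(\lambda) .$$

Following the proof of Theorem 1, we have that $r_{X,-}(\delta)$ is strictly increasing, convex, and continuously differentiable up to at least the third order inclusive over $\delta \in [0, \Delta^*_-(X))$, and furthermore

$$r_{X,-}(\delta) = \lambda(\delta - H(X)) - \ln\int p^{\lambda+1}(x)\,dx$$

with $\lambda = r'_{X,-}(\delta)$ satisfying

$$\delta_-(\lambda) = \delta .$$

Define

$$\sigma_{H,-}^2(X, \lambda) \triangleq \int \frac{p^{\lambda+1}(x)}{\int p^{\lambda+1}(y)\,dy}\,\bigl|-\ln p(x) - (H(X) - \delta_-(\lambda))\bigr|^2\,dx$$

and

$$M_{H,-}(X, \lambda) \triangleq \int \frac{p^{\lambda+1}(x)}{\int p^{\lambda+1}(y)\,dy}\,\bigl|-\ln p(x) - (H(X) - \delta_-(\lambda))\bigr|^3\,dx .$$

In parallel with Theorems 1 and 2, we have the following result, which is referred to as the left NEP with respect to $H(X)$ and can be proved similarly.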



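Theorem 3 (Left NEP). For any positive integer $n$,

$$\Pr\left\{-\frac{1}{n}\ln p(X^n) \le H(X) - \delta\right\} \le e^{-n r_{X,-}(\delta)} . \tag{2.39}$$

Furthermore, under the assumptions (2.38) and (2.5), the following also hold:

(a) There exists a $\delta^* > 0$ such that for any $\delta \in (0, \delta^*]$ and any positive integer $n$,

$$r_{X,-}(\delta) = \frac{1}{2\sigma_H^2(X)}\delta^2 + O(\delta^3) \tag{2.40}$$

and hence

$$\Pr\left\{-\frac{1}{n}\ln p(X^n) \le H(X) - \delta\right\} \le e^{-n\left(\frac{\delta^2}{2\sigma_H^2(X)} + O(\delta^3)\right)} . \tag{2.41}$$

(b) For any $\delta \in (0, \Delta^*_-(X))$ and any positive integer $n$,

$$\bar{\xi}_{H,-}(X, \lambda, n)\,e^{-n r_{X,-}(\delta)} \ge \Pr\left\{-\frac{1}{n}\ln p(X^n) \le H(X) - \delta\right\} \ge \underline{\xi}_{H,-}(X, \lambda, n)\,e^{-n r_{X,-}(\delta)} \tag{2.42}$$

where $\lambda = r'_{X,-}(\delta) > 0$, and

$$\bar{\xi}_{H,-}(X, \lambda, n) = \frac{2CM_{H,-}(X,\lambda)}{\sqrt{n}\,\sigma_{H,-}^3(X,\lambda)} + e^{\frac{n\lambda^2\sigma_{H,-}^2(X,\lambda)}{2}}\left[Q\bigl(\sqrt{n}\lambda\sigma_{H,-}(X,\lambda)\bigr) - Q\bigl(\rho^* + \sqrt{n}\lambda\sigma_{H,-}(X,\lambda)\bigr)\right] \tag{2.43}$$

$$\underline{\xi}_{H,-}(X, \lambda, n) = e^{\frac{n\lambda^2\sigma_{H,-}^2(X,\lambda)}{2}}\,Q\bigl(\rho_* + \sqrt{n}\lambda\sigma_{H,-}(X,\lambda)\bigr) \tag{2.44}$$

with $Q(\rho^*) = \frac{CM_{H,-}(X,\lambda)}{\sqrt{n}\,\sigma_{H,-}^3(X,\lambda)}$ and $Q(\rho_*) = \frac{1}{2} - \frac{2CM_{H,-}(X,\lambda)}{\sqrt{n}\,\sigma_{H,-}^3(X,\lambda)}$.

(c) For any $\delta \le c\sqrt{\frac{\ln n}{n}}$, where $c < \sigma_H(X)$ is a constant,

$$Q\left(\frac{\delta\sqrt{n}}{\sigma_H(X)}\right) - \frac{CM_H(X)}{\sqrt{n}\,\sigma_H^3(X)} \le \Pr\left\{-\frac{1}{n}\ln p(X^n) \le H(X) - \delta\right\} \le Q\left(\frac{\delta\sqrt{n}}{\sigma_H(X)}\right) + \frac{CM_H(X)}{\sqrt{n}\,\sigma_H^3(X)} .$$

Remarks similar to those (Remarks 1 and 2) following Theorem 2 can be drawn here concerning Theorem 3.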
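
The left-tail exponent can be checked in the same spirit as the right one. The sketch below, with assumed parameters and helper names, approximates $r_{X,-}(\delta)$ by a grid search for a Bernoulli($p$) source, compares it with the relative entropy $D(q\|p)$ at $q = p - \delta/\ln\frac{1-p}{p}$ (the value the Legendre transform reduces to for a binary source with $p < 1/2$), and verifies (2.39) against the exact binomial tail.

```python
import math

def left_exponent(probs, delta, lam_max=20.0, steps=20000):
    """Grid approximation of sup_{lam>=0} [lam*(delta - H) - ln sum_x p(x)^(lam+1)]."""
    H = -sum(p * math.log(p) for p in probs)
    best = 0.0                                   # the value at lam = 0
    for k in range(1, steps + 1):
        lam = lam_max * k / steps
        val = lam * (delta - H) - math.log(sum(p ** (lam + 1.0) for p in probs))
        best = max(best, val)
    return best

def exact_left_tail(p, n, delta):
    """Pr{-(1/n) ln p(X^n) <= H - delta} for a Bernoulli(p) source."""
    H = -p * math.log(p) - (1 - p) * math.log(1 - p)
    total = 0.0
    for k in range(n + 1):                       # k = number of ones
        rate = (-k * math.log(p) - (n - k) * math.log(1 - p)) / n
        if rate <= H - delta:
            total += math.exp(math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                              + k * math.log(p) + (n - k) * math.log(1 - p))
    return total

p, n, d = 0.1, 200, 0.1                          # hypothetical parameters
r_left = left_exponent([p, 1 - p], d)
q = p - d / math.log((1 - p) / p)                # tilted probability of a one
kl = q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))
left_tail = exact_left_tail(p, n, d)
left_bound = math.exp(-n * r_left)
print(r_left, kl, left_tail, left_bound)
```

For this source $\Delta^*_-(X) = H(X) + \ln(1-p) \approx 0.22$ nats, so $\delta = 0.1$ lies inside the range covered by Theorem 3.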

III. NEP With Respect to Conditional Entropy 

Consider now an IID source pair $(X, Y) = \{(X_i, Y_i)\}_{i=1}^{\infty}$ with finite conditional entropy $H(X|Y)$, where $H(X|Y)$ is the Shannon conditional entropy of $X_i$ given $Y_i$ if $X$ is discrete, and the conditional differential entropy of $X_i$ given $Y_i$ if $X$ is continuous. Let $p(x|y)$ be the conditional pmf or conditional pdf (as the case may be) of $X_i$ given $Y_i$, and $p(y)$ the pmf or pdf (as the case may be) of $Y_i$. By replacing $-\frac{1}{n}\ln p(X^n)$ with $-\frac{1}{n}\ln p(X^n|Y^n)$, all results and arguments in Section II can be carried over to this conditional case, yielding the NEP with respect to $H(X|Y)$.

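Specifically, define

$$\lambda^*(X|Y) \triangleq \sup\left\{\lambda \ge 0 : \int p(y)\left[\int p^{-\lambda+1}(x|y)\,dx\right]dy < \infty\right\} \tag{3.1}$$

where $\int dy$ is understood throughout this paper to be the summation over the source alphabet of $Y$ if $Y$ is discrete. Suppose that

$$\lambda^*(X|Y) > 0 . \tag{3.2}$$

Let

$$\sigma_H^2(X|Y) \triangleq \int\!\!\int p(y)p(x|y)[-\ln p(x|y)]^2\,dx\,dy - H^2(X|Y) \tag{3.3}$$

which will be referred to as the conditional information variance of $X$ given $Y$. It is not hard to see that under the assumption (3.2),

$$\int\!\!\int \frac{p(y)p^{-\lambda+1}(x|y)}{\int\!\!\int p(v)p^{-\lambda+1}(u|v)\,du\,dv}\,|\ln p(x|y)|^k\,dx\,dy < \infty \tag{3.4}$$

and

$$\int\!\!\int p(y)p^{-\lambda+1}(x|y)\,dx\,dy < \infty$$

for any $\lambda \in (0, \lambda^*(X|Y))$ and any positive integer $k$. Further assume that

$$\sigma_H^2(X|Y) > 0 \quad\text{and}\quad \int\!\!\int p(y)p(x|y)\,|\ln p(x|y)|^3\,dx\,dy < \infty . \tag{3.5}$$

Define for any $\delta \ge 0$

$$r_{X|Y}(\delta) \triangleq \sup_{\lambda \ge 0}\left[\lambda(H(X|Y)+\delta) - \ln\int\!\!\int p(y)p^{-\lambda+1}(x|y)\,dx\,dy\right] \tag{3.6}$$

and for any $\lambda \in [0, \lambda^*(X|Y))$

$$\delta(\lambda) \triangleq \int\!\!\int \frac{p(y)p^{-\lambda+1}(x|y)}{\int\!\!\int p(v)p^{-\lambda+1}(u|v)\,du\,dv}\,[-\ln p(x|y)]\,dx\,dy - H(X|Y) . \tag{3.7}$$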



(Throughout this section, $\delta(\lambda)$ should be understood with its above definition.) Then under the assumptions (3.2) and (3.5), $\delta(\lambda)$ is strictly increasing over $\lambda \in [0, \lambda^*(X|Y))$ with $\delta(0) = 0$. Let

$$\Delta^*(X|Y) \triangleq \lim_{\lambda \uparrow \lambda^*(X|Y)} \delta(\lambda) .$$

By an argument similar to that in the proof of Theorem 1, it can be shown that $r_{X|Y}(\delta)$ is strictly increasing, convex and continuously differentiable up to at least the third order inclusive over $\delta \in [0, \Delta^*(X|Y))$, and furthermore $r_{X|Y}(\delta)$ has the following parametric expression

$$r_{X|Y}(\delta(\lambda)) = \lambda(H(X|Y)+\delta(\lambda)) - \ln\int\!\!\int p(y)p^{-\lambda+1}(x|y)\,dx\,dy \tag{3.8}$$

with $\delta(\lambda)$ defined in (3.7) and $\lambda = r'_{X|Y}(\delta)$. For any $\lambda \in [0, \lambda^*(X|Y))$, define

$$f_\lambda(x, y) \triangleq \frac{p^{-\lambda}(x|y)}{\int\!\!\int p(v)p^{-\lambda+1}(u|v)\,du\,dv} \tag{3.9}$$
$$\sigma_H^2(X|Y, \lambda) \triangleq \int\!\!\int f_\lambda(x, y)p(y)p(x|y)\,\bigl|-\ln p(x|y) - (H(X|Y)+\delta(\lambda))\bigr|^2\,dx\,dy \tag{3.10}$$

$$M_H(X|Y, \lambda) \triangleq \int\!\!\int f_\lambda(x, y)p(y)p(x|y)\,\bigl|-\ln p(x|y) - (H(X|Y)+\delta(\lambda))\bigr|^3\,dx\,dy \tag{3.11}$$

where $\delta(\lambda)$ is defined in (3.7). Write $M_H(X|Y, 0)$ as $M_H(X|Y)$. It is easy to see that $\sigma_H^2(X|Y, 0) = \sigma_H^2(X|Y)$, $\sigma_H^2(X|Y, \lambda) = \delta'(\lambda)$, and

$$M_H(X|Y) = \int\!\!\int p(y)p(x|y)\,\bigl|-\ln p(x|y) - H(X|Y)\bigr|^3\,dx\,dy . \tag{3.12}$$

In parallel with Theorems 1 and 2, we have the following result, which is referred to as the right NEP with respect to $H(X|Y)$ and can be proved similarly.

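Theorem 4 (Right NEP With Respect to $H(X|Y)$). For any positive integer $n$,

$$\Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) > H(X|Y)+\delta\right\} \le e^{-n r_{X|Y}(\delta)} \tag{3.13}$$

where $X^n = X_1 X_2 \cdots X_n$ and $Y^n = Y_1 Y_2 \cdots Y_n$. Moreover, under the assumptions (3.2) and (3.5), the following also hold:

(a) There exists a $\delta^* > 0$ such that for any $\delta \in (0, \delta^*]$ and any positive integer $n$,

$$r_{X|Y}(\delta) = \frac{1}{2\sigma_H^2(X|Y)}\delta^2 + O(\delta^3) \tag{3.14}$$

and hence

$$\Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) > H(X|Y)+\delta\right\} \le e^{-n\left(\frac{\delta^2}{2\sigma_H^2(X|Y)} + O(\delta^3)\right)} . \tag{3.15}$$

(b) For any $\delta \in (0, \Delta^*(X|Y))$ and any positive integer $n$,

$$\underline{\xi}_H(X|Y, \lambda, n)\,e^{-n r_{X|Y}(\delta)} \le \Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) > H(X|Y)+\delta\right\} \le \bar{\xi}_H(X|Y, \lambda, n)\,e^{-n r_{X|Y}(\delta)} \tag{3.16}$$

where $\lambda = r'_{X|Y}(\delta) > 0$, and

$$\bar{\xi}_H(X|Y, \lambda, n) = \frac{2CM_H(X|Y,\lambda)}{\sqrt{n}\,\sigma_H^3(X|Y,\lambda)} + e^{\frac{n\lambda^2\sigma_H^2(X|Y,\lambda)}{2}}\left[Q\bigl(\sqrt{n}\lambda\sigma_H(X|Y,\lambda)\bigr) - Q\bigl(\rho^* + \sqrt{n}\lambda\sigma_H(X|Y,\lambda)\bigr)\right] \tag{3.17}$$

$$\underline{\xi}_H(X|Y, \lambda, n) = e^{\frac{n\lambda^2\sigma_H^2(X|Y,\lambda)}{2}}\,Q\bigl(\rho_* + \sqrt{n}\lambda\sigma_H(X|Y,\lambda)\bigr) \tag{3.18}$$

with $Q(\rho^*) = \frac{CM_H(X|Y,\lambda)}{\sqrt{n}\,\sigma_H^3(X|Y,\lambda)}$ and $Q(\rho_*) = \frac{1}{2} - \frac{2CM_H(X|Y,\lambda)}{\sqrt{n}\,\sigma_H^3(X|Y,\lambda)}$.

(c) For any $\delta \le c\sqrt{\frac{\ln n}{n}}$, where $c < \sigma_H(X|Y)$ is a constant,

$$Q\left(\frac{\delta\sqrt{n}}{\sigma_H(X|Y)}\right) - \frac{CM_H(X|Y)}{\sqrt{n}\,\sigma_H^3(X|Y)} \le \Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) > H(X|Y)+\delta\right\} \le Q\left(\frac{\delta\sqrt{n}}{\sigma_H(X|Y)}\right) + \frac{CM_H(X|Y)}{\sqrt{n}\,\sigma_H^3(X|Y)} . \tag{3.19}$$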
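The probability that $-\frac{1}{n}\ln p(X^n|Y^n)$ is away from $H(X|Y)$ to the left can be bounded similarly. For completeness, we state the result without proof again. Define

$$\lambda^*_-(X|Y) \triangleq \sup\left\{\lambda \ge 0 : \int\!\!\int p(y)p^{\lambda+1}(x|y)\,dx\,dy < \infty\right\} . \tag{3.20}$$

Suppose that

$$\lambda^*_-(X|Y) > 0 . \tag{3.21}$$

Define for any $\delta \ge 0$

$$r_{X|Y,-}(\delta) \triangleq \sup_{\lambda \ge 0}\left[\lambda(\delta - H(X|Y)) - \ln\int\!\!\int p(y)p^{\lambda+1}(x|y)\,dx\,dy\right]$$

and for any $\lambda \in [0, \lambda^*_-(X|Y))$

$$\delta_-(\lambda) \triangleq \int\!\!\int \frac{p(y)p^{\lambda+1}(x|y)}{\int\!\!\int p(v)p^{\lambda+1}(u|v)\,du\,dv}\,[\ln p(x|y)]\,dx\,dy + H(X|Y) .$$

(Throughout this section, $\delta_-(\lambda)$ should be understood with its above definition.) Then under the assumption (3.5), $\delta_-(\lambda)$ is strictly increasing over $\lambda \in [0, \lambda^*_-(X|Y))$ with $\delta_-(0) = 0$. Let

$$\Delta^*_-(X|Y) \triangleq \lim_{\lambda \uparrow \lambda^*_-(X|Y)} \delta_-(\lambda) .$$

By using an argument similar to that in the proof of Theorem 1, it can be shown that $r_{X|Y,-}(\delta)$ is strictly increasing, convex, and continuously differentiable up to at least the third order inclusive over $\delta \in [0, \Delta^*_-(X|Y))$, and furthermore $r_{X|Y,-}(\delta)$ has the following parametric expression

$$r_{X|Y,-}(\delta_-(\lambda)) = \lambda(\delta_-(\lambda) - H(X|Y)) - \ln\int\!\!\int p(y)p^{\lambda+1}(x|y)\,dx\,dy$$

with $\lambda = r'_{X|Y,-}(\delta)$ satisfying

$$\delta_-(\lambda) = \delta .$$

Define

$$\sigma_{H,-}^2(X|Y, \lambda) \triangleq \int\!\!\int \frac{p(y)p^{\lambda+1}(x|y)}{\int\!\!\int p(v)p^{\lambda+1}(u|v)\,du\,dv}\,\bigl|-\ln p(x|y) - (H(X|Y) - \delta_-(\lambda))\bigr|^2\,dx\,dy$$

and

$$M_{H,-}(X|Y, \lambda) \triangleq \int\!\!\int \frac{p(y)p^{\lambda+1}(x|y)}{\int\!\!\int p(v)p^{\lambda+1}(u|v)\,du\,dv}\,\bigl|-\ln p(x|y) - (H(X|Y) - \delta_-(\lambda))\bigr|^3\,dx\,dy .$$

In parallel with Theorem 3, we have the following result, which is referred to as the left NEP with respect to $H(X|Y)$ and can be proved similarly.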

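Theorem 5 (Left NEP With Respect to $H(X|Y)$). For any positive integer $n$,

$$\Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) \le H(X|Y) - \delta\right\} \le e^{-n r_{X|Y,-}(\delta)} . \tag{3.22}$$

Furthermore, under the assumptions (3.21) and (3.5), the following also hold:

(a) There exists a $\delta^* > 0$ such that for any $\delta \in (0, \delta^*]$ and any positive integer $n$,

$$r_{X|Y,-}(\delta) = \frac{1}{2\sigma_H^2(X|Y)}\delta^2 + O(\delta^3) \tag{3.23}$$

and hence

$$\Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) \le H(X|Y) - \delta\right\} \le e^{-n\left(\frac{\delta^2}{2\sigma_H^2(X|Y)} + O(\delta^3)\right)} . \tag{3.24}$$

(b) For any $\delta \in (0, \Delta^*_-(X|Y))$ and any positive integer $n$,

$$\underline{\xi}_{H,-}(X|Y, \lambda, n)\,e^{-n r_{X|Y,-}(\delta)} \le \Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) \le H(X|Y) - \delta\right\} \le \bar{\xi}_{H,-}(X|Y, \lambda, n)\,e^{-n r_{X|Y,-}(\delta)} \tag{3.25}$$

where $\lambda = r'_{X|Y,-}(\delta) > 0$, and

$$\bar{\xi}_{H,-}(X|Y, \lambda, n) = \frac{2CM_{H,-}(X|Y,\lambda)}{\sqrt{n}\,\sigma_{H,-}^3(X|Y,\lambda)} + e^{\frac{n\lambda^2\sigma_{H,-}^2(X|Y,\lambda)}{2}}\left[Q\bigl(\sqrt{n}\lambda\sigma_{H,-}(X|Y,\lambda)\bigr) - Q\bigl(\rho^* + \sqrt{n}\lambda\sigma_{H,-}(X|Y,\lambda)\bigr)\right] \tag{3.26}$$

$$\underline{\xi}_{H,-}(X|Y, \lambda, n) = e^{\frac{n\lambda^2\sigma_{H,-}^2(X|Y,\lambda)}{2}}\,Q\bigl(\rho_* + \sqrt{n}\lambda\sigma_{H,-}(X|Y,\lambda)\bigr) \tag{3.27}$$

with $Q(\rho^*) = \frac{CM_{H,-}(X|Y,\lambda)}{\sqrt{n}\,\sigma_{H,-}^3(X|Y,\lambda)}$ and $Q(\rho_*) = \frac{1}{2} - \frac{2CM_{H,-}(X|Y,\lambda)}{\sqrt{n}\,\sigma_{H,-}^3(X|Y,\lambda)}$.

(c) For any $\delta \le c\sqrt{\frac{\ln n}{n}}$, where $c < \sigma_H(X|Y)$ is a constant,

$$Q\left(\frac{\delta\sqrt{n}}{\sigma_H(X|Y)}\right) - \frac{CM_H(X|Y)}{\sqrt{n}\,\sigma_H^3(X|Y)} \le \Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) \le H(X|Y) - \delta\right\} \le Q\left(\frac{\delta\sqrt{n}}{\sigma_H(X|Y)}\right) + \frac{CM_H(X|Y)}{\sqrt{n}\,\sigma_H^3(X|Y)} . \tag{3.28}$$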
Remarks similar to those (Remarks 1 and 2) following Theorem 2 can be drawn here concerning Theorems 4 and 5.

Theorem 4 will be used in [3] to show that for any binary input memoryless channel with
uniform capacity achieving input $X$, random linear codes of block length $n$ with either Elias'
generator ensembles or Gallager's parity check ensembles can reach within 5 + rx|y(5) 



In 



2(l-C)Mff{X\Y,\) 



In n 
2n 



<yjj(x\Y,>.) — channel capacity while maintaining word error probability upper bounded 



by (eH(X|F,A,n)4 



2(l-C)Mj:j(X|y,A) - 
y^ajj(X\Y,X) . 



e-nrx|y(<5)_ 1^ particular, when 6 = ^2aa'jj{X\Y)\h^, the 



word error probability is upper bounded by 



-n 



+ 0{n 



-a In n ^ 



and the achievable rate 



2\/ na In n 

(in nats) of random linear codes of block length n with either Elias' generator ensembles or 

+ OC-^) of 



+ [a 



1 N In n 
2/ 



Gallager's parity check ensembles is within ^y2aajj{X\Y) 
the channel capacity; when 5 = ^ for any c, the word error probability is upper bounded by 



Q 



I Mh{X\Y) 



and the achievable rate (in nats) is within 



2n n 



{1-C)Mh{X\Y) 
aUX\Y) 



aH(X\Y)^ 

of the channel capacity. 

We conclude this section by illustrating $r_{X|Y}(\delta)$ and $\sigma_H^2(X|Y)$ when $X$ and $Y$ are the uniform input and the corresponding output of the binary symmetric channel (BSC) and the binary input Gaussian channel.



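Example 1 (BSC): Combining (3.7) and (3.8), it is not hard to verify that

$$\begin{aligned}
r_{X|Y}(\delta(\lambda)) &= \int\!\!\int p(x, y)f_\lambda(x, y)\ln f_\lambda(x, y)\,dx\,dy \\
&= \int\!\!\int p(x, y)f_\lambda(x, y)\ln\frac{p(x|y)f_\lambda(x, y)}{p(x|y)}\,dx\,dy \\
&= D\bigl(p(x|y)f_\lambda(x, y)\,\|\,p(x|y)\bigr) .
\end{aligned} \tag{3.29}$$

For BSC, simple calculation reveals that

$$p(x|y) = \begin{cases} 1 - p & \text{if } x = y \\ p & \text{otherwise} \end{cases} \tag{3.30}$$

and

$$p(x|y)f_\lambda(x, y) = \begin{cases} \dfrac{(1-p)^{-\lambda+1}}{p^{-\lambda+1} + (1-p)^{-\lambda+1}} & \text{if } x = y \\[2ex] \dfrac{p^{-\lambda+1}}{p^{-\lambda+1} + (1-p)^{-\lambda+1}} & \text{otherwise.} \end{cases} \tag{3.31}$$

By defining

$$D(q\|p) \triangleq q\ln\frac{q}{p} + (1-q)\ln\frac{1-q}{1-p}$$

and (3.29), we have

$$r_{X|Y}(\delta(\lambda)) = D\left(\frac{p^{-\lambda+1}}{p^{-\lambda+1} + (1-p)^{-\lambda+1}}\,\Big\|\,p\right) = D\left(p + \frac{p(1-p)\bigl(p^{-\lambda} - (1-p)^{-\lambda}\bigr)}{p^{-\lambda+1} + (1-p)^{-\lambda+1}}\,\Big\|\,p\right) . \tag{3.32}$$

On the other hand, by substituting (3.30) and (3.31) into (3.7),

$$\delta(\lambda) = \frac{p(1-p)\bigl(p^{-\lambda} - (1-p)^{-\lambda}\bigr)}{p^{-\lambda+1} + (1-p)^{-\lambda+1}}\,\ln\frac{1-p}{p} \tag{3.33}$$

and eventually, we have

$$r_{X|Y}(\delta) = D\left(p + \frac{\delta}{\ln\frac{1-p}{p}}\,\Big\|\,p\right) \tag{3.34}$$

and plugging (3.30) into (3.10) with $\lambda = 0$ yields

$$\begin{aligned}
\sigma_H^2(X|Y) &= (1-p)\ln^2(1-p) + p\ln^2 p - \bigl[-p\ln p - (1-p)\ln(1-p)\bigr]^2 \\
&= p(1-p)\ln^2\frac{1-p}{p} .
\end{aligned} \tag{3.35}$$

Moreover, as $\mathcal{X}$ and $\mathcal{Y}$ are both finite alphabets, it is easy to show that $\lambda^*(X|Y) = \infty$, where $\lambda^*(X|Y)$ is defined in (3.1). Then

$$\Delta^*(X|Y) = \lim_{\lambda \uparrow +\infty}\delta(\lambda) = (1-p)\ln\frac{1-p}{p} \tag{3.36}$$

and

$$r_{\max} \triangleq \lim_{\delta \uparrow \Delta^*(X|Y)} r_{X|Y}(\delta) = -\ln p . \tag{3.37}$$

Based on Theorem 4, $\Delta^*(X|Y)$ and $r_{\max}$ can be interpreted in the following way. As

$$\max_{x^n, y^n}\left[-\frac{1}{n}\ln p(x^n|y^n)\right] = -\ln p,$$

then

$$\lim_{\delta \uparrow \Delta^*(X|Y)}\Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) > H(X|Y)+\delta\right\} = \Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) = -\ln p\right\} = p^n = e^{-n r_{\max}} .$$

In addition, for $\delta \ge \Delta^*(X|Y)$,

$$\Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) > H(X|Y)+\delta\right\} = 0 .$$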
Fig. 1. $r_{X|Y}(\delta)$ versus $\delta$ for BSC when $p = 0.10$



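By adopting the convention that $0\ln 0 = 0$ and $e^{-\infty} = 0$,

$$r_{X|Y}(\delta) = \begin{cases} D\left(p + \dfrac{\delta}{\ln\frac{1-p}{p}}\,\Big\|\,p\right) & \text{if } \delta \in [0, \Delta^*(X|Y)] \\[2ex] +\infty & \text{if } \delta > \Delta^*(X|Y) . \end{cases} \tag{3.38}$$

A sample plot of $r_{X|Y}(\delta)$ is provided in Figure 1 when $p = 0.10$.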
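
The closed forms of Example 1 can be cross-checked numerically. The sketch below, which is an illustration under assumed names and parameters rather than code from the paper, compares the relative-entropy expression for $r_{X|Y}(\delta)$ with a direct grid search over $\lambda$ of $\lambda(H(X|Y)+\delta) - \ln[p^{-\lambda+1} + (1-p)^{-\lambda+1}]$, and also checks the variance identity and the limiting value $-\ln p$ at $\delta = (1-p)\ln\frac{1-p}{p}$.

```python
import math

def kl_binary(q, p):
    """D(q||p) in nats with the convention 0 ln 0 = 0."""
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total

def r_bsc_closed_form(p, delta):
    """Relative-entropy form of r_{X|Y}(delta) for the BSC with uniform input."""
    q = min(1.0, p + delta / math.log((1 - p) / p))   # clamp guards against rounding at the endpoint
    return kl_binary(q, p)

def r_bsc_by_definition(p, delta, lam_max=20.0, steps=20000):
    """Grid approximation of the supremum defining r_{X|Y}(delta) for the BSC."""
    H = -p * math.log(p) - (1 - p) * math.log(1 - p)  # H(X|Y) for uniform input
    best = 0.0
    for k in range(1, steps + 1):
        lam = lam_max * k / steps
        z = p ** (1 - lam) + (1 - p) ** (1 - lam)
        best = max(best, lam * (H + delta) - math.log(z))
    return best

p = 0.1                                               # crossover probability used in Fig. 1
L = math.log((1 - p) / p)
H = -p * math.log(p) - (1 - p) * math.log(1 - p)
delta_max = (1 - p) * L                               # Delta*(X|Y)
var_direct = (1 - p) * math.log(1 - p) ** 2 + p * math.log(p) ** 2 - H ** 2
var_closed = p * (1 - p) * L ** 2
print(r_bsc_closed_form(p, 0.5), r_bsc_by_definition(p, 0.5), delta_max)
```

Evaluating `r_bsc_closed_form` on a grid of `delta` values in `[0, delta_max]` reproduces the kind of curve described for Figure 1.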

Example 2 (Binary Input Gaussian Channel): Without loss of generality, we assume that the input of the channel is modulated to $\{+1, -1\}$, and therefore

$$p(y|x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(y-x)^2}{2\sigma^2}} \tag{3.39}$$

for $x \in \{+1, -1\}$, where $\sigma^2$ is the variance of the noise. Calculation of $r_{X|Y}(\delta)$ and $\sigma_H^2(X|Y)$ is much more involved than that for BSC. Tedious evaluation is omitted here with results presented as follows. Let $U$ be a standard Gaussian random variable, i.e.,

$$p(u) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{u^2}{2}}$$



and define 
Then



5(A) 



E 



aU+1 



aU+l 



and 



4(x|r) = E 



In'' g 



E 



In 5- 



(Tf/+ 1 



(3.40) 
(3.41) 

(3.42) 



To get a better understanding of those quantities, let us first determine $\lambda^*(X|Y)$ and $\Delta^*(X|Y)$. In fact, we can show that $\lambda^*(X|Y) = \infty$ by verifying that



p{y) 



dy < OO 



for any finite $\lambda > 0$. Towards this, observe that



piv) 



J2p~'^\^\y) 



dy 



is an increasing function with respect to $\lambda$ since $p(x|y) < 1$ for any $x$ and $y$. Therefore,



p{y) 



dy 



E 



< E 



9 



aU + 1 



< oo 



as 



E\e 



sUi 



e 2 < OO 



for any finite $s$. Now let us show the claim $\Delta^*(X|Y) = \infty$. According to (3.40),



5(A) 



E[,M^)ln^m] 



E [g^ (^)] 



-E 



In^i 



aU + 1 



d 

dX 



InE 



9 



aU + 1 



H{X\Y) 



As $H(X|Y)$ is a constant and always less than $\ln 2$, the claim $\Delta^*(X|Y) = \infty$ is equivalent to showing that



dX 



InE 



9 



aU + l 
is unbounded when $\lambda \to \infty$. By the fact that $\delta(\lambda)$ is an increasing function of $\lambda$, which also
implies that so is



^^^^ 



aU + l 



a2 



we only have to verify that 



lnE[^^+^(^)]-lnE[/(^)] _ E [/+^ (^)] 

k + i-k ^bH"^)] 

or simply 

E b'^ (^)] 

is unbounded when $k \to \infty$, which is indeed the case as



E [g'^' (^)] _ 

EbM^)] , fk 



e e 



2fc2_2fe 



= e (^e^ j ^ oo 
as $k \to \infty$. And consequently, it is not hard to see that

$$r_{X|Y}(\delta) \to \infty$$

as $\delta \to \infty$. The interpretation based on Theorem 4 is as follows:

$$-\frac{1}{n}\ln p(x^n|y^n) - H(X|Y)$$

can approach $\infty$ for proper choice of $x^n$ and $y^n$, but

$$\lim_{\delta \to \infty}\Pr\left\{-\frac{1}{n}\ln p(X^n|Y^n) > H(X|Y)+\delta\right\} = e^{-\infty} = 0 .$$

Figure 2 shows a sample plot of $r_{X|Y}(\delta)$ for BIGC with $\sigma = 1.0$.

IV. NEP With Respect to Mutual Information and Relative Entropy 

Consider now an IID source pair $(X, Y) = \{(X_i, Y_i)\}_{i=1}^{\infty}$ with finite mutual information $I(X;Y) > 0$. Let $p(y|x)$ be the conditional pmf or pdf (as the case may be) of $Y_i$ given $X_i$. In this section, we extend the NEP to $I(X;Y)$ and relative entropy.

Fig. 2. $r_{X|Y}(\delta)$ for BIGC



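A. NEP With Respect to $I(X;Y)$

We begin with the left NEP with respect to $I(X;Y)$. Define

$$\lambda^*_-(X;Y) \triangleq \sup\left\{ \lambda \geq 0 : \int\!\!\int p(x,y) \left[ \frac{p(y|x)}{p(y)} \right]^{-\lambda} dx\,dy < \infty \right\} . \tag{4.1}$$

Suppose that

$$\lambda^*_-(X;Y) > 0 . \tag{4.2}$$

Let

$$\sigma_I^2(X;Y) \triangleq \int\!\!\int p(x,y) \left[ \ln \frac{p(y|x)}{p(y)} \right]^2 dx\,dy - I^2(X;Y) \tag{4.3}$$

which will be referred to as the mutual information variance of $X$ and $Y$. It is not hard to see that under the assumption (4.2),

$$\int\!\!\int p(x,y) \, \frac{\left[ \frac{p(y|x)}{p(y)} \right]^{-\lambda}}{\int\!\!\int p(u,v) \left[ \frac{p(v|u)}{p(v)} \right]^{-\lambda} du\,dv} \left| -\ln \frac{p(y|x)}{p(y)} \right|^k dx\,dy < \infty \tag{4.4}$$

and

$$\int\!\!\int p(x,y) \left[ \frac{p(y|x)}{p(y)} \right]^{-\lambda} dx\,dy < \infty$$

for any $\lambda \in (0, \lambda^*_-(X;Y))$ and any positive integer $k$. Further assume that

$$\sigma_I^2(X;Y) > 0 \quad \text{and} \quad \int\!\!\int p(x,y) \left| \ln \frac{p(y|x)}{p(y)} \right|^3 dx\,dy < \infty . \tag{4.5}$$

Define for any $\delta \geq 0$

$$r_{X;Y,-}(\delta) \triangleq \sup_{\lambda \geq 0} \left[ \lambda (\delta - I(X;Y)) - \ln \int\!\!\int p(x,y) \left[ \frac{p(y|x)}{p(y)} \right]^{-\lambda} dx\,dy \right] \tag{4.6}$$

and for any $\lambda \in [0, \lambda^*_-(X;Y))$

$$f_{-\lambda}(x,y) \triangleq \frac{\left[ \frac{p(y|x)}{p(y)} \right]^{-\lambda}}{\int\!\!\int p(u,v) \left[ \frac{p(v|u)}{p(v)} \right]^{-\lambda} du\,dv} \tag{4.7}$$

$$\delta_-(\lambda) \triangleq \int\!\!\int p(x,y) f_{-\lambda}(x,y) \left[ -\ln \frac{p(y|x)}{p(y)} \right] dx\,dy + I(X;Y) . \tag{4.8}$$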

(Throughout this section, $\delta_-(\lambda)$ should be understood with its above definition.) Then under the assumptions (4.2) and (4.5), $\delta_-(\lambda)$ is strictly increasing over $\lambda \in [0, \lambda^*_-(X;Y))$ with $\delta_-(0) = 0$. Let

$$\Delta^*_-(X;Y) \triangleq \lim_{\lambda \uparrow \lambda^*_-(X;Y)} \delta_-(\lambda) .$$

By an argument similar to that in the proof of Theorem 1, it can be shown that $r_{X;Y,-}(\delta)$ is strictly increasing, convex and continuously differentiable up to at least the third order inclusive over $\delta \in [0, \Delta^*_-(X;Y))$, and furthermore $r_{X;Y,-}(\delta)$ has the following parametric expression

$$r_{X;Y,-}(\delta_-(\lambda)) = \lambda (\delta_-(\lambda) - I(X;Y)) - \ln \int\!\!\int p(x,y) \left[ \frac{p(y|x)}{p(y)} \right]^{-\lambda} dx\,dy \tag{4.9}$$

with $\lambda = r'_{X;Y,-}(\delta)$ satisfying

$$\delta_-(\lambda) = \delta .$$
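As a rough numerical illustration of the definition (4.6) and of the exponential bound of the left NEP built from it, the following Python sketch evaluates $r_{X;Y,-}(\delta)$ for a binary symmetric channel with uniform input and compares the resulting bound with the exact tail probability. The channel, the crossover probability `eps`, the helper names, and the ternary search capped at `lam_max` are assumptions made here for illustration only; they do not come from the paper.

```python
import math


def left_nep_exponent(delta, pairs, lam_max=50.0, iters=200):
    """Evaluate sup over lam >= 0 of lam*(delta - I) - ln E[exp(-lam*L)].

    `pairs` is a list of (probability, L), where L = ln p(y|x)/p(y) is the
    information density of one (x, y) outcome. The objective is concave in
    lam, so a ternary search over [0, lam_max] suffices for this sketch.
    """
    mutual_info = sum(prob * dens for prob, dens in pairs)

    def objective(lam):
        mgf = sum(prob * math.exp(-lam * dens) for prob, dens in pairs)
        return lam * (delta - mutual_info) - math.log(mgf)

    lo, hi = 0.0, lam_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return max(0.0, objective(0.5 * (lo + hi)))


def bsc_pairs(eps):
    """Information density of a binary symmetric channel with uniform input."""
    return [(1.0 - eps, math.log(2.0 * (1.0 - eps))), (eps, math.log(2.0 * eps))]


def exact_left_tail(n, eps, delta):
    """Exact Pr{(1/n) ln p(Y^n|X^n)/p(Y^n) <= I - delta} for this channel."""
    pairs = bsc_pairs(eps)
    mutual_info = sum(prob * dens for prob, dens in pairs)
    good, bad = pairs[0][1], pairs[1][1]
    total = 0.0
    for flips in range(n + 1):
        density_sum = flips * bad + (n - flips) * good
        if density_sum <= n * (mutual_info - delta):
            total += math.comb(n, flips) * eps**flips * (1.0 - eps) ** (n - flips)
    return total
```

For small `delta` the computed exponent should be close to `delta**2 / (2 * variance)`, where `variance` is the variance of the information density, and `exact_left_tail(n, eps, delta)` should never exceed `math.exp(-n * left_nep_exponent(delta, bsc_pairs(eps)))`.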



Further define for any $\lambda \in [0, \lambda^*_-(X;Y))$

$$\sigma^2_{I,-}(X;Y,\lambda) \triangleq \int\!\!\int f_{-\lambda}(x,y) p(x,y) \left| \ln \frac{p(y|x)}{p(y)} - (I(X;Y) - \delta_-(\lambda)) \right|^2 dx\,dy \tag{4.10}$$

$$M_{I,-}(X;Y,\lambda) \triangleq \int\!\!\int f_{-\lambda}(x,y) p(x,y) \left| \ln \frac{p(y|x)}{p(y)} - (I(X;Y) - \delta_-(\lambda)) \right|^3 dx\,dy . \tag{4.11}$$

Write $M_{I,-}(X;Y,0)$ simply as $M_I(X;Y)$. It is easy to see that $\sigma^2_{I,-}(X;Y,0) = \sigma^2_I(X;Y)$, $\sigma^2_{I,-}(X;Y,\lambda) = \delta'_-(\lambda)$, and

$$M_I(X;Y) = \int\!\!\int p(x,y) \left| \ln \frac{p(y|x)}{p(y)} - I(X;Y) \right|^3 dx\,dy . \tag{4.12}$$

In parallel with Theorems 3 and 5, we have the following result, which is referred to as the left NEP with respect to $I(X;Y)$ and can be proved similarly.

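Theorem 6 (Left NEP With Respect to $I(X;Y)$). For any positive integer $n$,

$$\Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{p(Y^n)} \leq I(X;Y) - \delta \right\} \leq e^{-n r_{X;Y,-}(\delta)} . \tag{4.13}$$

Furthermore, under the assumptions (4.2) and (4.5), the following also hold:

(a) There exists a $\delta^* > 0$ such that for any $\delta \in (0, \delta^*]$ and any positive integer $n$,

$$r_{X;Y,-}(\delta) = \frac{1}{2\sigma_I^2(X;Y)} \delta^2 + O(\delta^3) \tag{4.14}$$

and hence

$$\Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{p(Y^n)} \leq I(X;Y) - \delta \right\} \leq e^{-n \left( \frac{\delta^2}{2\sigma_I^2(X;Y)} + O(\delta^3) \right)} . \tag{4.15}$$

(b) For any $\delta \in (0, \Delta^*_-(X;Y))$ and any positive integer $n$

$$\underline{\xi}_{I,-}(X;Y,\lambda,n) e^{-n r_{X;Y,-}(\delta)} \leq \Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{p(Y^n)} \leq I(X;Y) - \delta \right\} \leq \bar{\xi}_{I,-}(X;Y,\lambda,n) e^{-n r_{X;Y,-}(\delta)} \tag{4.16}$$

where $\lambda = r'_{X;Y,-}(\delta) > 0$, and

$$\bar{\xi}_{I,-}(X;Y,\lambda,n) = \frac{2 C M_{I,-}(X;Y,\lambda)}{\sqrt{n} \sigma^3_{I,-}(X;Y,\lambda)} + e^{\frac{n \lambda^2 \sigma^2_{I,-}(X;Y,\lambda)}{2}} \left[ Q(\sqrt{n} \lambda \sigma_{I,-}(X;Y,\lambda)) - Q(\rho^* + \sqrt{n} \lambda \sigma_{I,-}(X;Y,\lambda)) \right] \tag{4.17}$$

$$\underline{\xi}_{I,-}(X;Y,\lambda,n) = e^{\frac{n \lambda^2 \sigma^2_{I,-}(X;Y,\lambda)}{2}} Q(\rho_* + \sqrt{n} \lambda \sigma_{I,-}(X;Y,\lambda)) \tag{4.18}$$

with $Q(\rho^*) = \frac{C M_{I,-}(X;Y,\lambda)}{\sqrt{n} \sigma^3_{I,-}(X;Y,\lambda)}$ and $Q(\rho_*) = \frac{1}{2} - \frac{2 C M_{I,-}(X;Y,\lambda)}{\sqrt{n} \sigma^3_{I,-}(X;Y,\lambda)}$.

(c) For any $\delta \leq c \sqrt{\frac{\ln n}{n}}$, where $c < \sigma_I(X;Y)$ is a constant,

$$Q\left( \frac{\delta \sqrt{n}}{\sigma_I(X;Y)} \right) - \frac{C M_I(X;Y)}{\sqrt{n} \sigma_I^3(X;Y)} \leq \Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{p(Y^n)} \leq I(X;Y) - \delta \right\} \leq Q\left( \frac{\delta \sqrt{n}}{\sigma_I(X;Y)} \right) + \frac{C M_I(X;Y)}{\sqrt{n} \sigma_I^3(X;Y)} . \tag{4.19}$$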

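The probability that $\frac{1}{n} \ln \frac{p(Y^n|X^n)}{p(Y^n)}$ is away from $I(X;Y)$ to the right can be bounded in a similar manner. For completeness, we state these bounds again without proof. Define

$$\lambda^*(X;Y) \triangleq \sup\left\{ \lambda \geq 0 : \int\!\!\int p(x,y) \left[ \frac{p(y|x)}{p(y)} \right]^{\lambda} dx\,dy < \infty \right\} . \tag{4.20}$$

Suppose that

$$\lambda^*(X;Y) > 0 . \tag{4.21}$$

Define for any $\delta \geq 0$

$$r_{X;Y}(\delta) \triangleq \sup_{\lambda \geq 0} \left[ \lambda (I(X;Y) + \delta) - \ln \int\!\!\int p(x,y) \left[ \frac{p(y|x)}{p(y)} \right]^{\lambda} dx\,dy \right] \tag{4.22}$$

and for any $\lambda \in [0, \lambda^*(X;Y))$

$$f_{\lambda}(x,y) \triangleq \frac{\left[ \frac{p(y|x)}{p(y)} \right]^{\lambda}}{\int\!\!\int p(u,v) \left[ \frac{p(v|u)}{p(v)} \right]^{\lambda} du\,dv} \tag{4.23}$$

$$\delta(\lambda) \triangleq \int\!\!\int p(x,y) f_{\lambda}(x,y) \left[ \ln \frac{p(y|x)}{p(y)} \right] dx\,dy - I(X;Y) . \tag{4.24}$$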



(Throughout this section, $\delta(\lambda)$ should be understood with its above definition.) Then under the assumptions (4.21) and (4.5), $\delta(\lambda)$ is strictly increasing over $\lambda \in [0, \lambda^*(X;Y))$ with $\delta(0) = 0$. Let

$$\Delta^*(X;Y) \triangleq \lim_{\lambda \uparrow \lambda^*(X;Y)} \delta(\lambda) .$$

By an argument similar to that in the proof of Theorem 1, it can be shown that $r_{X;Y}(\delta)$ is strictly increasing, convex and continuously differentiable up to at least the third order over $\delta \in [0, \Delta^*(X;Y))$, and furthermore $r_{X;Y}(\delta)$ has the following parametric expression

$$r_{X;Y}(\delta(\lambda)) = \lambda (I(X;Y) + \delta(\lambda)) - \ln \int\!\!\int p(x,y) \left[ \frac{p(y|x)}{p(y)} \right]^{\lambda} dx\,dy \tag{4.25}$$

with $\lambda = r'_{X;Y}(\delta)$ satisfying

$$\delta(\lambda) = \delta .$$



Further define for any $\lambda \in [0, \lambda^*(X;Y))$

$$\sigma_I^2(X;Y,\lambda) \triangleq \int\!\!\int f_{\lambda}(x,y) p(x,y) \left| \ln \frac{p(y|x)}{p(y)} - (I(X;Y) + \delta(\lambda)) \right|^2 dx\,dy \tag{4.26}$$

$$M_I(X;Y,\lambda) \triangleq \int\!\!\int f_{\lambda}(x,y) p(x,y) \left| \ln \frac{p(y|x)}{p(y)} - (I(X;Y) + \delta(\lambda)) \right|^3 dx\,dy . \tag{4.27}$$

It is easy to see that $\sigma_I^2(X;Y,0) = \sigma_I^2(X;Y)$ and $\sigma_I^2(X;Y,\lambda) = \delta'(\lambda)$.

In parallel with Theorems 1, 2 and 4, we have the following result, which is referred to as the right NEP with respect to $I(X;Y)$ and can be proved similarly.

Theorem 7 (Right NEP With Respect to $I(X;Y)$). For any positive integer $n$,

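$$\Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{p(Y^n)} > I(X;Y) + \delta \right\} \leq e^{-n r_{X;Y}(\delta)} . \tag{4.28}$$

Furthermore, under the assumptions (4.21) and (4.5), the following also hold:

(a) There exists a $\delta^* > 0$ such that for any $\delta \in (0, \delta^*]$ and any positive integer $n$,

$$r_{X;Y}(\delta) = \frac{1}{2\sigma_I^2(X;Y)} \delta^2 + O(\delta^3) \tag{4.29}$$

and hence

$$\Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{p(Y^n)} > I(X;Y) + \delta \right\} \leq e^{-n \left( \frac{\delta^2}{2\sigma_I^2(X;Y)} + O(\delta^3) \right)} . \tag{4.30}$$

(b) For any $\delta \in (0, \Delta^*(X;Y))$ and any positive integer $n$

$$\underline{\xi}_I(X;Y,\lambda,n) e^{-n r_{X;Y}(\delta)} \leq \Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{p(Y^n)} > I(X;Y) + \delta \right\} \leq \bar{\xi}_I(X;Y,\lambda,n) e^{-n r_{X;Y}(\delta)} \tag{4.31}$$

where $\lambda = r'_{X;Y}(\delta) > 0$, and

$$\bar{\xi}_I(X;Y,\lambda,n) = \frac{2 C M_I(X;Y,\lambda)}{\sqrt{n} \sigma_I^3(X;Y,\lambda)} + e^{\frac{n \lambda^2 \sigma_I^2(X;Y,\lambda)}{2}} \left[ Q(\sqrt{n} \lambda \sigma_I(X;Y,\lambda)) - Q(\rho^* + \sqrt{n} \lambda \sigma_I(X;Y,\lambda)) \right] \tag{4.32}$$

$$\underline{\xi}_I(X;Y,\lambda,n) = e^{\frac{n \lambda^2 \sigma_I^2(X;Y,\lambda)}{2}} Q(\rho_* + \sqrt{n} \lambda \sigma_I(X;Y,\lambda)) \tag{4.33}$$

with $Q(\rho^*) = \frac{C M_I(X;Y,\lambda)}{\sqrt{n} \sigma_I^3(X;Y,\lambda)}$ and $Q(\rho_*) = \frac{1}{2} - \frac{2 C M_I(X;Y,\lambda)}{\sqrt{n} \sigma_I^3(X;Y,\lambda)}$.

(c) For any $\delta \leq c \sqrt{\frac{\ln n}{n}}$, where $c < \sigma_I(X;Y)$ is a constant,

$$Q\left( \frac{\delta \sqrt{n}}{\sigma_I(X;Y)} \right) - \frac{C M_I(X;Y)}{\sqrt{n} \sigma_I^3(X;Y)} \leq \Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{p(Y^n)} > I(X;Y) + \delta \right\} \leq Q\left( \frac{\delta \sqrt{n}}{\sigma_I(X;Y)} \right) + \frac{C M_I(X;Y)}{\sqrt{n} \sigma_I^3(X;Y)} . \tag{4.34}$$

Remarks similar to those (Remark 1 and 2) following Theorem 2 can be drawn here concerning Theorems 6 and 7.

B. NEP With Respect to Relative Entropy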

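The IID source pair $(X,Y) = \{(X_i, Y_i)\}_{i=1}^{\infty}$ considered so far is arbitrary. Let us now focus on the case in which the source $X$ is discrete, but $Y$ could be either discrete or continuous. Let $\mathcal{P}$ denote the set of all probability distributions over the source alphabet $\mathcal{X}$. For any $t \in \mathcal{P}$, let

$$q_t(y) \triangleq \sum_{x \in \mathcal{X}} t(x) p(y|x) \tag{4.35}$$

$$q_t(y^n) \triangleq \prod_{i=1}^{n} q_t(y_i) \tag{4.36}$$

$$D(t,x) \triangleq \int p(y|x) \ln \frac{p(y|x)}{q_t(y)} dy \tag{4.37}$$

and

$$I(t;P) \triangleq \sum_{x \in \mathcal{X}} t(x) \int p(y|x) \ln \frac{p(y|x)}{q_t(y)} dy \tag{4.38}$$

where $y^n = y_1 y_2 \cdots y_n$, and $P = \{p(y|x)\}$ represents a channel with $p(y|x)$ as its transitional pmf or pdf (as the case may be). Clearly, $D(t,x)$ is the divergence or relative entropy between $p(y|x)$ and $q_t(y)$, and $I(t;P)$ is the mutual information between the input and output of the channel $P$ when the input is distributed according to $t$. To be specific, we denote the pmf of each $X_i$ by $p_X$. Without loss of generality, we assume that $p_X(x) > 0$ for any $x \in \mathcal{X}$. Since

$$\int\!\!\int p(x,y) \left[ \frac{p(y|x)}{p(y)} \right]^{-\lambda} dx\,dy = \sum_{a \in \mathcal{X}} p_X(a) \int p(y|a) \left[ \frac{\sum_{b \in \mathcal{X}} p_X(b) p(y|b)}{p(y|a)} \right]^{\lambda} dy$$

it is not hard to see that for any $\lambda > 0$,

$$\int\!\!\int p(x,y) \left[ \frac{p(y|x)}{p(y)} \right]^{-\lambda} dx\,dy < \infty$$

if and only if

$$\int p(y|a) \left[ \frac{\sum_{b \in \mathcal{X}} p(y|b)}{p(y|a)} \right]^{\lambda} dy < \infty$$

for any $a \in \mathcal{X}$. Therefore, $\lambda^*_-(X;Y)$ defined in (4.1) is also equal to

$$\sup\left\{ \lambda \geq 0 : \int p(y|a) \left[ \frac{p(y|a)}{q_t(y)} \right]^{-\lambda} dy < \infty, \; a \in \mathcal{X} \right\}$$

for any $t \in \mathcal{P}$ with $t(a) > 0$ for any $a \in \mathcal{X}$ (such $t \in \mathcal{P}$ will be said to have full support).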
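To make the objects of this subsection concrete, the sketch below computes the type $t$ of a sequence, the induced output distribution $q_t$, the divergence $D(t,x)$ and the mutual information $I(t;P)$. It assumes a channel with a finite output alphabet stored as a nested dictionary `channel[x][y]`; this data layout and all function names are hypothetical choices made for illustration, and the continuous-output case treated in the text is not covered.

```python
import math
from collections import Counter


def sequence_type(xs, alphabet):
    """Empirical distribution (type) t of the sequence xs over `alphabet`."""
    counts = Counter(xs)
    n = len(xs)
    return {a: counts[a] / n for a in alphabet}


def has_full_support(t):
    """True when t(a) > 0 for every symbol a."""
    return all(mass > 0.0 for mass in t.values())


def output_distribution(t, channel):
    """q_t(y) = sum over x of t(x) p(y|x); channel[x][y] holds p(y|x)."""
    q = {}
    for x, row in channel.items():
        for y, p_yx in row.items():
            q[y] = q.get(y, 0.0) + t[x] * p_yx
    return q


def divergence(t, channel, x):
    """D(t, x): relative entropy (in nats) between p(.|x) and q_t."""
    q = output_distribution(t, channel)
    return sum(p * math.log(p / q[y]) for y, p in channel[x].items() if p > 0.0)


def mutual_information(t, channel):
    """I(t; P) = sum over x of t(x) D(t, x), in nats."""
    return sum(t[x] * divergence(t, channel, x) for x in channel if t[x] > 0.0)
```

For a binary symmetric channel and a sequence with equally many zeros and ones, `mutual_information` should return $\ln 2$ minus the binary entropy (in nats) of the crossover probability.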
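Define for any $t \in \mathcal{P}$ with full support and any $\delta \geq 0$

$$r_-(t,\delta) \triangleq \sup_{\lambda \geq 0} \left[ \lambda (\delta - I(t;P)) - \sum_{x \in \mathcal{X}} t(x) \ln \int p(y|x) \left[ \frac{p(y|x)}{q_t(y)} \right]^{-\lambda} dy \right] \tag{4.39}$$

and for any $\lambda \in [0, \lambda^*_-(X;Y))$ and any $t \in \mathcal{P}$ with full support

$$f_{-\lambda}(y|x) \triangleq \frac{\left[ \frac{p(y|x)}{q_t(y)} \right]^{-\lambda}}{\int p(v|x) \left[ \frac{p(v|x)}{q_t(v)} \right]^{-\lambda} dv} \tag{4.40}$$

$$D(t,x,\lambda) \triangleq \int p(y|x) f_{-\lambda}(y|x) \left[ \ln \frac{p(y|x)}{q_t(y)} \right] dy \tag{4.41}$$

$$\delta_-(t,\lambda) \triangleq \sum_{x \in \mathcal{X}} t(x) \int p(y|x) f_{-\lambda}(y|x) \left[ -\ln \frac{p(y|x)}{q_t(y)} \right] dy + I(t;P) . \tag{4.42}$$

It is not hard to verify that $\delta_-(t,0) = 0$ and

$$\frac{\partial \delta_-(t,\lambda)}{\partial \lambda} = \sum_{x \in \mathcal{X}} t(x) \left[ \int p(y|x) f_{-\lambda}(y|x) \left| -\ln \frac{p(y|x)}{q_t(y)} \right|^2 dy - \left( \int p(y|x) f_{-\lambda}(y|x) \left[ -\ln \frac{p(y|x)}{q_t(y)} \right] dy \right)^2 \right]$$

$$= \sum_{x \in \mathcal{X}} t(x) \left[ \int p(y|x) f_{-\lambda}(y|x) \left| \ln \frac{p(y|x)}{q_t(y)} \right|^2 dy - D^2(t,x,\lambda) \right] > 0$$

where the last inequality is due to (4.5). Therefore, $\delta_-(t,\lambda)$ as a function of $\lambda$ is strictly increasing over $\lambda \in [0, \lambda^*_-(X;Y))$. Let

$$\Delta^*_-(t) \triangleq \lim_{\lambda \uparrow \lambda^*_-(X;Y)} \delta_-(t,\lambda) .$$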

By an argument similar to that in the proof of Theorem 1, it can be shown that $r_-(t,\delta)$ is strictly increasing, convex and continuously differentiable up to at least the third order inclusive over $\delta \in [0, \Delta^*_-(t))$, and furthermore $r_-(t,\delta)$ has the following parametric expression

$$r_-(t, \delta_-(t,\lambda)) = \lambda (\delta_-(t,\lambda) - I(t;P)) - \sum_{x \in \mathcal{X}} t(x) \ln \int p(y|x) \left[ \frac{p(y|x)}{q_t(y)} \right]^{-\lambda} dy \tag{4.43}$$

with

$$\lambda = \frac{\partial r_-(t,\delta)}{\partial \delta}$$

satisfying

$$\delta_-(t,\lambda) = \delta .$$


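Further define for any $\lambda \in [0, \lambda^*_-(X;Y))$

$$\sigma^2_{D,-}(t;P,\lambda) \triangleq \sum_{x \in \mathcal{X}} t(x) \int p(y|x) f_{-\lambda}(y|x) \left| \ln \frac{p(y|x)}{q_t(y)} - D(t,x,\lambda) \right|^2 dy \tag{4.44}$$

and

$$M_{D,-}(t;P,\lambda) \triangleq \sum_{x \in \mathcal{X}} t(x) \int p(y|x) f_{-\lambda}(y|x) \left| \ln \frac{p(y|x)}{q_t(y)} - D(t,x,\lambda) \right|^3 dy . \tag{4.45}$$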

Write $\sigma^2_{D,-}(t;P,0)$ simply as $\sigma^2_D(t;P)$, $M_{D,-}(t;P,0)$ as $M_D(t;P)$, $\sigma^2_D(p_X;P)$ as $\sigma^2_D(X;Y)$, and $M_D(p_X;P)$ as $M_D(X;Y)$. It is not hard to see that

$$\sigma^2_D(t;P) = \sum_{x \in \mathcal{X}} t(x) \int p(y|x) \left| \ln \frac{p(y|x)}{q_t(y)} - \int p(v|x) \ln \frac{p(v|x)}{q_t(v)} dv \right|^2 dy$$

$$\sigma^2_D(X;Y) = \sum_{x \in \mathcal{X}} p_X(x) \int p(y|x) \left| \ln \frac{p(y|x)}{p(y)} - \int p(v|x) \ln \frac{p(v|x)}{p(v)} dv \right|^2 dy$$

$$M_D(t;P) = \sum_{x \in \mathcal{X}} t(x) \int p(y|x) \left| \ln \frac{p(y|x)}{q_t(y)} - \int p(v|x) \ln \frac{p(v|x)}{q_t(v)} dv \right|^3 dy$$

and

$$M_D(X;Y) = \sum_{x \in \mathcal{X}} p_X(x) \int p(y|x) \left| \ln \frac{p(y|x)}{p(y)} - \int p(v|x) \ln \frac{p(v|x)}{p(v)} dv \right|^3 dy .$$

For obvious reasons, we will refer to $\sigma^2_D(t;P)$ ($\sigma^2_D(X;Y)$, respectively) as the conditional divergence (or relative entropy) variance of $P$ given $t$ ($Y$ given $X$, respectively).

In parallel with Theorems 3, 5 and 6, we have the following result, which is referred to as the left NEP with respect to relative entropy.

Theorem 8 (Left NEP With Respect to Relative Entropy). For any sequence $x^n = x_1 \cdots x_n$
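from $\mathcal{X}$, let $t \in \mathcal{P}$ be the type of $x^n$, i.e., $n t(a)$, $a \in \mathcal{X}$, is the number of times the symbol $a$ appears in $x^n$. Assume that $t$ has full support. Then

$$\Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)} \leq I(t;P) - \delta \,\Big|\, X^n = x^n \right\} \leq e^{-n r_-(t,\delta)} . \tag{4.46}$$

Furthermore, under the assumptions (4.2) and (4.5), the following also hold:

(a) There exists a $\delta^* > 0$ such that for any $\delta \in (0, \delta^*]$

$$r_-(t,\delta) = \frac{1}{2\sigma^2_D(t;P)} \delta^2 + O(\delta^3) \tag{4.47}$$

and hence

$$\Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)} \leq I(t;P) - \delta \,\Big|\, X^n = x^n \right\} \leq e^{-n \left( \frac{\delta^2}{2\sigma^2_D(t;P)} + O(\delta^3) \right)} . \tag{4.48}$$

(b) For any $\delta \in (0, \Delta^*_-(t))$

$$\underline{\xi}_{D,-}(t;P,\lambda,n) e^{-n r_-(t,\delta)} \leq \Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)} \leq I(t;P) - \delta \,\Big|\, X^n = x^n \right\} \leq \bar{\xi}_{D,-}(t;P,\lambda,n) e^{-n r_-(t,\delta)} \tag{4.49}$$

where $\lambda = \frac{\partial r_-(t,\delta)}{\partial \delta} > 0$, and

$$\bar{\xi}_{D,-}(t;P,\lambda,n) = \frac{2 C M_{D,-}(t;P,\lambda)}{\sqrt{n} \sigma^3_{D,-}(t;P,\lambda)} + e^{\frac{n \lambda^2 \sigma^2_{D,-}(t;P,\lambda)}{2}} \left[ Q(\sqrt{n} \lambda \sigma_{D,-}(t;P,\lambda)) - Q(\rho^* + \sqrt{n} \lambda \sigma_{D,-}(t;P,\lambda)) \right] \tag{4.50}$$

$$\underline{\xi}_{D,-}(t;P,\lambda,n) = e^{\frac{n \lambda^2 \sigma^2_{D,-}(t;P,\lambda)}{2}} Q(\rho_* + \sqrt{n} \lambda \sigma_{D,-}(t;P,\lambda)) \tag{4.51}$$

with $Q(\rho^*) = \frac{C M_{D,-}(t;P,\lambda)}{\sqrt{n} \sigma^3_{D,-}(t;P,\lambda)}$ and $Q(\rho_*) = \frac{1}{2} - \frac{2 C M_{D,-}(t;P,\lambda)}{\sqrt{n} \sigma^3_{D,-}(t;P,\lambda)}$.

(c) For any $\delta \leq c \sqrt{\frac{\ln n}{n}}$, where $c < \sigma_D(t;P)$ is a constant,

$$Q\left( \frac{\delta \sqrt{n}}{\sigma_D(t;P)} \right) - \frac{C M_D(t;P)}{\sqrt{n} \sigma^3_D(t;P)} \leq \Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)} \leq I(t;P) - \delta \,\Big|\, X^n = x^n \right\} \leq Q\left( \frac{\delta \sqrt{n}}{\sigma_D(t;P)} \right) + \frac{C M_D(t;P)}{\sqrt{n} \sigma^3_D(t;P)} . \tag{4.52}$$

Proof of Theorem 8: The inequality (4.46) comes from the Chernoff bound. To see this is indeed the case, note that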

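$$\Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)} \leq I(t;P) - \delta \,\Big|\, X^n = x^n \right\} \leq \inf_{\lambda \geq 0} \frac{\mathbf{E}\left[ \left( \frac{p(Y^n|X^n)}{q_t(Y^n)} \right)^{-\lambda} \Big|\, X^n = x^n \right]}{e^{n \lambda (\delta - I(t;P))}}$$

$$= \inf_{\lambda \geq 0} \frac{\prod_{a \in \mathcal{X}} \left[ \int p(y|a) \left( \frac{p(y|a)}{q_t(y)} \right)^{-\lambda} dy \right]^{n t(a)}}{e^{n \lambda (\delta - I(t;P))}}$$

$$= \inf_{\lambda \geq 0} \exp\left\{ -n \left[ \lambda (\delta - I(t;P)) - \sum_{a \in \mathcal{X}} t(a) \ln \int p(y|a) \left( \frac{p(y|a)}{q_t(y)} \right)^{-\lambda} dy \right] \right\}$$

$$= e^{-n r_-(t,\delta)} \tag{4.53}$$

which completes the proof of (4.46).

The equation (4.47) follows from the Taylor expansion of $r_-(t,\delta)$ at $\delta = 0$ and the fact that

$$\left. \frac{\partial^2 r_-(t,\delta)}{\partial \delta^2} \right|_{\delta = 0} = \frac{1}{\sigma^2_D(t;P)} .$$

What remains is to prove (4.49) and (4.52). To this end, let

$$f_{-\lambda}(y^n|x^n) = \prod_{i=1}^{n} f_{-\lambda}(y_i|x_i) .$$

With $\lambda = \frac{\partial r_-(t,\delta)}{\partial \delta}$, it follows from (4.43) that

$$r_-(t,\delta) = \lambda (\delta - I(t;P)) - \sum_{x \in \mathcal{X}} t(x) \ln \int p(y|x) \left[ \frac{p(y|x)}{q_t(y)} \right]^{-\lambda} dy .$$

Then we have

$$\Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)} \leq I(t;P) - \delta \,\Big|\, X^n = x^n \right\} = \int_{\frac{1}{n} \ln \frac{p(y^n|x^n)}{q_t(y^n)} \leq I(t;P) - \delta} f^{-1}_{-\lambda}(y^n|x^n) f_{-\lambda}(y^n|x^n) p(y^n|x^n) \, dy^n$$

$$= e^{-n r_-(t,\delta)} \int_{\ln \frac{p(y^n|x^n)}{q_t(y^n)} - n(I(t;P) - \delta) \leq 0} e^{\lambda \left[ \ln \frac{p(y^n|x^n)}{q_t(y^n)} - n(I(t;P) - \delta) \right]} f_{-\lambda}(y^n|x^n) p(y^n|x^n) \, dy^n$$

$$= e^{-n r_-(t,\delta)} \int_{\rho \leq 0} e^{\sqrt{n} \lambda \sigma_{D,-}(t;P,\lambda) \rho} \, dF_n(\rho) \tag{4.54}$$

where

$$F_n(\rho) \triangleq \Pr\left\{ \frac{\sum_{i=1}^{n} \ln \frac{p(Z_i|x_i)}{q_t(Z_i)} - n(I(t;P) - \delta)}{\sqrt{n} \sigma_{D,-}(t;P,\lambda)} \leq \rho \right\}$$

and $Z_i$ takes values over the alphabet of $Y$ according to the pmf or pdf (as the case may be) $f_{-\lambda}(z|x_i) p(z|x_i)$. It is easy to verify that

$$\mathbf{E}\left[ \ln \frac{p(Z_i|x_i)}{q_t(Z_i)} \right] = D(t, x_i, \lambda)$$

and

$$\sum_{i=1}^{n} D(t, x_i, \lambda) = n \sum_{x \in \mathcal{X}} t(x) D(t,x,\lambda) = n (I(t;P) - \delta)$$

which further implies that

$$F_n(\rho) = \Pr\left\{ \frac{1}{\sqrt{n} \sigma_{D,-}(t;P,\lambda)} \sum_{i=1}^{n} \left[ \ln \frac{p(Z_i|x_i)}{q_t(Z_i)} - D(t, x_i, \lambda) \right] \leq \rho \right\} .$$

Applying Lemma 1 to the independent sequence

$$\left\{ \ln \frac{p(Z_i|x_i)}{q_t(Z_i)} - D(t, x_i, \lambda) \right\}_{i=1}^{n}$$

the argument similar to that in the proof of Theorem 2 can then be used to establish (4.49).

Finally, consider another sequence of independent random variables $W_1, W_2, \cdots, W_n$, where $W_i$ takes values over the alphabet of $Y$ according to the pmf or pdf (as the case may be) $p(w|x_i)$. Applying Lemma 1 directly to

$$\left\{ \ln \frac{p(W_i|x_i)}{q_t(W_i)} - D(t, x_i) \right\}_{i=1}^{n}$$

we then get (4.52). This completes the proof of Theorem 8.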



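The conditional probability that given $X^n = x^n$, $\frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)}$ is away from $I(t;P)$ to the right can be bounded similarly. For completeness, we state these bounds below without proof. Define for any $t \in \mathcal{P}$ with full support and any $\delta \geq 0$

$$r(t,\delta) \triangleq \sup_{\lambda \geq 0} \left[ \lambda (I(t;P) + \delta) - \sum_{x \in \mathcal{X}} t(x) \ln \int p(y|x) \left[ \frac{p(y|x)}{q_t(y)} \right]^{\lambda} dy \right] \tag{4.55}$$

and for any $\lambda \in [0, \lambda^*(X;Y))$ and any $t \in \mathcal{P}$ with full support

$$f_{\lambda}(y|x) \triangleq \frac{\left[ \frac{p(y|x)}{q_t(y)} \right]^{\lambda}}{\int p(v|x) \left[ \frac{p(v|x)}{q_t(v)} \right]^{\lambda} dv} \tag{4.56}$$

$$D_+(t,x,\lambda) \triangleq \int p(y|x) f_{\lambda}(y|x) \left[ \ln \frac{p(y|x)}{q_t(y)} \right] dy \tag{4.57}$$

$$\delta(t,\lambda) \triangleq \sum_{x \in \mathcal{X}} t(x) \int p(y|x) f_{\lambda}(y|x) \left[ \ln \frac{p(y|x)}{q_t(y)} \right] dy - I(t;P) . \tag{4.58}$$

Then under the condition (4.5), $\delta(t,\lambda)$ as a function of $\lambda$ is strictly increasing over $\lambda \in [0, \lambda^*(X;Y))$ with $\delta(t,0) = 0$. Let

$$\Delta^*(t) \triangleq \lim_{\lambda \uparrow \lambda^*(X;Y)} \delta(t,\lambda) .$$

By an argument similar to that in the proof of Theorem 1, it can be shown that $r(t,\delta)$ is strictly increasing, convex and continuously differentiable up to at least the third order over $\delta \in [0, \Delta^*(t))$, and furthermore $r(t,\delta)$ has the following parametric expression

$$r(t, \delta(t,\lambda)) = \lambda (I(t;P) + \delta(t,\lambda)) - \sum_{x \in \mathcal{X}} t(x) \ln \int p(y|x) \left[ \frac{p(y|x)}{q_t(y)} \right]^{\lambda} dy \tag{4.59}$$

with

$$\lambda = \frac{\partial r(t,\delta)}{\partial \delta}$$

satisfying

$$\delta(t,\lambda) = \delta .$$

Further define for any $\lambda \in [0, \lambda^*(X;Y))$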



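$$\sigma^2_D(t;P,\lambda) \triangleq \sum_{x \in \mathcal{X}} t(x) \int p(y|x) f_{\lambda}(y|x) \left| \ln \frac{p(y|x)}{q_t(y)} - D_+(t,x,\lambda) \right|^2 dy \tag{4.60}$$

and

$$M_D(t;P,\lambda) \triangleq \sum_{x \in \mathcal{X}} t(x) \int p(y|x) f_{\lambda}(y|x) \left| \ln \frac{p(y|x)}{q_t(y)} - D_+(t,x,\lambda) \right|^3 dy . \tag{4.61}$$

Then the following result can be proved similarly, which is referred to as the right NEP with respect to relative entropy.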

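Theorem 9 (Right NEP With Respect to Relative Entropy). For any sequence $x^n = x_1 \cdots x_n$ from $\mathcal{X}$, let $t \in \mathcal{P}$ be the type of $x^n$, i.e., $n t(a)$, $a \in \mathcal{X}$, is the number of times the symbol $a$ appears in $x^n$. Assume that $t$ has full support. Then

$$\Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)} > I(t;P) + \delta \,\Big|\, X^n = x^n \right\} \leq e^{-n r(t,\delta)} . \tag{4.62}$$

Furthermore, under the assumptions (4.21) and (4.5), the following also hold:

(a) There exists a $\delta^* > 0$ such that for any $\delta \in (0, \delta^*]$

$$r(t,\delta) = \frac{1}{2\sigma^2_D(t;P)} \delta^2 + O(\delta^3) \tag{4.63}$$

and hence

$$\Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)} > I(t;P) + \delta \,\Big|\, X^n = x^n \right\} \leq e^{-n \left( \frac{\delta^2}{2\sigma^2_D(t;P)} + O(\delta^3) \right)} . \tag{4.64}$$

(b) For any $\delta \in (0, \Delta^*(t))$

$$\underline{\xi}_D(t;P,\lambda,n) e^{-n r(t,\delta)} \leq \Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)} > I(t;P) + \delta \,\Big|\, X^n = x^n \right\} \leq \bar{\xi}_D(t;P,\lambda,n) e^{-n r(t,\delta)} \tag{4.65}$$

where $\lambda = \frac{\partial r(t,\delta)}{\partial \delta} > 0$, and

$$\bar{\xi}_D(t;P,\lambda,n) = \frac{2 C M_D(t;P,\lambda)}{\sqrt{n} \sigma^3_D(t;P,\lambda)} + e^{\frac{n \lambda^2 \sigma^2_D(t;P,\lambda)}{2}} \left[ Q(\sqrt{n} \lambda \sigma_D(t;P,\lambda)) - Q(\rho^* + \sqrt{n} \lambda \sigma_D(t;P,\lambda)) \right] \tag{4.66}$$

$$\underline{\xi}_D(t;P,\lambda,n) = e^{\frac{n \lambda^2 \sigma^2_D(t;P,\lambda)}{2}} Q(\rho_* + \sqrt{n} \lambda \sigma_D(t;P,\lambda)) \tag{4.67}$$

with $Q(\rho^*) = \frac{C M_D(t;P,\lambda)}{\sqrt{n} \sigma^3_D(t;P,\lambda)}$ and $Q(\rho_*) = \frac{1}{2} - \frac{2 C M_D(t;P,\lambda)}{\sqrt{n} \sigma^3_D(t;P,\lambda)}$.

(c) For any $\delta \leq c \sqrt{\frac{\ln n}{n}}$, where $c < \sigma_D(t;P)$ is a constant,

$$Q\left( \frac{\delta \sqrt{n}}{\sigma_D(t;P)} \right) - \frac{C M_D(t;P)}{\sqrt{n} \sigma^3_D(t;P)} \leq \Pr\left\{ \frac{1}{n} \ln \frac{p(Y^n|X^n)}{q_t(Y^n)} > I(t;P) + \delta \,\Big|\, X^n = x^n \right\} \leq Q\left( \frac{\delta \sqrt{n}}{\sigma_D(t;P)} \right) + \frac{C M_D(t;P)}{\sqrt{n} \sigma^3_D(t;P)} .$$

Remarks similar to those (Remark 1 and 2) following Theorem 2 can be drawn here concerning Theorems 8 and 9. Theorem 8 will be used in [3] to establish a non-asymptotic coding theorem for Shannon random codes.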

V. NEP Application to Fixed Rate Source Coding 

Assume that the source alphabet $\mathcal{X}$ is finite. In this section, we make use of the NEP with respect to $H(X)$ to establish a non-asymptotic fixed rate source coding theorem, which reveals, for any finite block length $n$, a complete picture about the tradeoff between the minimum rate of fixed rate coding of $X_1 \cdots X_n$ and error probability when the error probability is a constant, or goes to 0 with block length $n$ at a sub-polynomial $n^{-\alpha}$, $0 < \alpha < 1$, polynomial $n^{-\alpha}$, $\alpha \geq 1$, or sub-exponential $e^{-n^{\alpha}}$, $0 < \alpha < 1$, speed. We begin with the definition of fixed rate source code.

Definition 1. Given a source from alphabet $\mathcal{X}$, a fixed rate source code with coding length $n$ is defined as a mapping $i : S_n \to \{1, 2, \ldots, |S_n|\}$, where $S_n$ is a subset of $\mathcal{X}^n$. The performance of the code is measured by the rate $R_n = \frac{1}{n} \ln |S_n|$ (in nats) and error probability $\Pr\{X^n \notin S_n\}$.

As can be seen from the definition, the design of a fixed rate source code is equivalent to picking a subset of $\mathcal{X}^n$. Given the source statistics $p(x)$, one can easily show that the optimal way to pick $S_n$ is to order $x^n$ in the non-increasing order of $p(x^n)$ and include those $x^n$ with rank less than or equal to $|S_n|$. Then we have the following non-asymptotic fixed rate source coding theorem.

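Theorem 10. Let $R_n(\epsilon_n)$ denote the minimum rate (in nats) of fixed rate coding of $X_1 X_2 \cdots X_n$ subject to the error probability not larger than $\epsilon_n$. Under the assumptions (2.2) and (2.5), for any $n$ and $\epsilon_n > 0$,

$$\bar{\delta} \geq R_n(\epsilon_n) - H(X) \geq \underline{\delta} - r_X(\underline{\delta}) + \frac{-d + \ln \left[ \frac{1}{2} - Q\left( \frac{d}{\sqrt{n} \sigma_H(X,\lambda)} \right) - \frac{2 C M_H(X,\lambda)}{\sqrt{n} \sigma_H^3(X,\lambda)} \right]}{n} \tag{5.1}$$

for any constant $d$ satisfying $\frac{1}{2} - Q\left( \frac{d}{\sqrt{n} \sigma_H(X,\lambda)} \right) - \frac{2 C M_H(X,\lambda)}{\sqrt{n} \sigma_H^3(X,\lambda)} > 0$, where $\bar{\delta}$ is the solution to the equation

$$\epsilon_n = \bar{\xi}_H(X, r'_X(\bar{\delta}), n) e^{-n r_X(\bar{\delta})} \tag{5.2}$$

$\underline{\delta}$ is the solution to the equation

$$(1 + e^{-n}) \epsilon_n = \underline{\xi}_H(X, r'_X(\underline{\delta}), n) e^{-n r_X(\underline{\delta})} \tag{5.3}$$

and $\lambda = r'_X(\underline{\delta})$. In particular, the following hold, depending on whether $\epsilon_n$ is a constant, or how fast $\epsilon_n$ goes to 0.
(a) When $\epsilon_n$ decreases exponentially with respect to $n$,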


(mi,) / In en lnn\ Inen 

(5.4) 

where $r_X^{(\mathrm{inv})}(\cdot)$ is the inverse function of $r_X(\cdot)$.

(b) When = n-fe"'^" /or a G (0, 1), 

^/2aH{X)n~'^ +0(71-"-^) > Rn{er,)-H{X) 

> V2aH{X)n-^ - O (n-'-^^ (5.5) 

for a e (0, 1), and 

V2(7j^(X)n-'^+0(n-(^-")) > Rn{en)-H{X) 

> V2aHiX)n-'-^ - O (n-^^"")) (5.6) 

for a e [|, 1). 

(c) W/jen Sn = /br a > 0, 



> MX),/^-OlW— |. (5.7) 
(d) When $\epsilon_n = \epsilon$ remains a constant,



> Rn{en)-H{X) 



/n \ n 

where $Q^{-1}(\cdot)$ is the inverse function of $Q(\cdot)$.
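To get a feel for the constant-error regime in Part (d), the following Python sketch evaluates a leading-order approximation $H(X) + \sigma_H(X) Q^{-1}(\epsilon) / \sqrt{n}$ of the minimum rate. This is only a sketch under a stated assumption: it takes the dominant term of the bounds in Part (d) to be $\sigma_H(X) Q^{-1}(\epsilon)/\sqrt{n}$, as suggested by the choices of $\bar{\delta}$ and $\underline{\delta}$ in the proof of Part (d), and it ignores all correction terms. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def entropy_and_variance(pmf):
    """Return H(X) and the information variance sigma_H^2(X), both in nats."""
    h = -sum(p * math.log(p) for p in pmf if p > 0.0)
    second_moment = sum(p * math.log(p) ** 2 for p in pmf if p > 0.0)
    return h, second_moment - h * h


def leading_order_rate(pmf, n, eps):
    """H(X) + sigma_H(X) * Q^{-1}(eps) / sqrt(n), ignoring correction terms."""
    h, var = entropy_and_variance(pmf)
    # Q(t) = 1 - Phi(t), hence Q^{-1}(eps) = Phi^{-1}(1 - eps).
    q_inv = NormalDist().inv_cdf(1.0 - eps)
    return h + math.sqrt(var / n) * q_inv
```

With `eps` above 0.5 the value of `Q^{-1}(eps)` is negative, so the approximation falls below the entropy, which is the effect discussed in Remark 4.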



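Proof of Theorem 10: Define

$$S_n(\delta) \triangleq \left\{ x^n : -\frac{1}{n} \ln p(x^n) \leq H(X) + \delta \right\}$$

and

$$\epsilon_n(\delta) \triangleq \Pr\{X^n \notin S_n(\delta)\} .$$

Clearly $\epsilon_n(\delta)$ is a non-increasing function of $\delta$. Now let $\bar{\delta}$ and $\underline{\delta}$ satisfy that

$$\epsilon_n(\bar{\delta}) \leq \epsilon_n < \epsilon_n(\underline{\delta}) . \tag{5.9}$$

According to the discussion on optimal fixed-rate source codes,

$$\frac{1}{n} \ln |S_n(\underline{\delta})| \leq R_n(\epsilon_n) \leq \frac{1}{n} \ln |S_n(\bar{\delta})| .$$

Observe that

$$|S_n(\bar{\delta})| e^{-n(H(X) + \bar{\delta})} \leq \sum_{x^n \in S_n(\bar{\delta})} p(x^n) \leq 1$$

which implies that

$$R_n(\epsilon_n) \leq \frac{1}{n} \ln |S_n(\bar{\delta})| \leq H(X) + \bar{\delta} .$$

Towards the lower bound on $R_n(\epsilon_n)$, further define

$$S_n(\underline{\delta}, d) \triangleq \left\{ x^n : H(X) + \underline{\delta} - \frac{d}{n} < -\frac{1}{n} \ln p(x^n) \leq H(X) + \underline{\delta} \right\}$$

for some constant $d > 0$. Then we have

$$\sum_{x^n \in S_n(\underline{\delta}, d)} p(x^n) = \sum_{x^n \in S_n(\underline{\delta}, d)} f_{\lambda}^{-1}(x^n) f_{\lambda}(x^n) p(x^n)$$

$$\geq e^{-n r_X(\underline{\delta})} \sum_{x^n \in S_n(\underline{\delta}, d)} f_{\lambda}(x^n) p(x^n)$$

$$= e^{-n r_X(\underline{\delta})} \Pr\left\{ -\frac{d}{n} < \frac{1}{n} \sum_{i=1}^{n} \left[ -\ln p(Z_i) \right] - (H(X) + \underline{\delta}) \leq 0 \right\}$$

$$\geq e^{-n r_X(\underline{\delta})} \left[ \frac{1}{2} - Q\left( \frac{d}{\sqrt{n} \sigma_H(X,\lambda)} \right) - \frac{2 C M_H(X,\lambda)}{\sqrt{n} \sigma_H^3(X,\lambda)} \right]$$

where $\lambda = r'_X(\underline{\delta})$, $\{Z_i\}_{i=1}^{n}$ are IID random variables with common pmf $f_{\lambda}(z) p(z)$, and the last inequality is due to the direct application of Lemma 1 (Berry-Esseen Central Limit Theorem) to $\{-\ln p(Z_i) - (H(X) + \underline{\delta})\}_{i=1}^{n}$. And therefore, since $p(x^n) < e^{-n(H(X) + \underline{\delta}) + d}$ for every $x^n \in S_n(\underline{\delta}, d)$,

$$R_n(\epsilon_n) \geq \frac{1}{n} \ln |S_n(\underline{\delta})| \geq \frac{1}{n} \ln |S_n(\underline{\delta}, d)|$$

$$\geq H(X) + \underline{\delta} - \frac{d}{n} - r_X(\underline{\delta}) + \frac{1}{n} \ln \left[ \frac{1}{2} - Q\left( \frac{d}{\sqrt{n} \sigma_H(X,\lambda)} \right) - \frac{2 C M_H(X,\lambda)}{\sqrt{n} \sigma_H^3(X,\lambda)} \right] . \tag{5.12}$$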



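Note that $\frac{1}{2} - Q\left( \frac{d}{\sqrt{n} \sigma_H(X,\lambda)} \right) - \frac{2 C M_H(X,\lambda)}{\sqrt{n} \sigma_H^3(X,\lambda)} = \Theta\left( \frac{1}{\sqrt{n}} \right)$ for constant $d > 0$. Then (5.1) is proved by showing that $\bar{\delta}$ and $\underline{\delta}$ calculated according to (5.2) and (5.3) indeed satisfy (5.9), where we invoke Theorem 2, i.e.

$$\epsilon_n(\bar{\delta}) = \Pr\{X^n \notin S_n(\bar{\delta})\} \leq \bar{\xi}_H(X, r'_X(\bar{\delta}), n) e^{-n r_X(\bar{\delta})} = \epsilon_n$$

while

$$\epsilon_n(\underline{\delta}) = \Pr\{X^n \notin S_n(\underline{\delta})\} \geq \underline{\xi}_H(X, r'_X(\underline{\delta}), n) e^{-n r_X(\underline{\delta})} > \epsilon_n .$$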

Let us now look at special cases. 

(a) When $\epsilon_n$ decreases exponentially with respect to $n$, i.e. $\frac{1}{n} \ln \epsilon_n \to c$ as $n \to +\infty$ for some constant $c < 0$, we have

lne„ ln^HiX,r'x{3),n) 



n n 

Note that 



rx{S). (5.13) 



iniX, X, n) > 3 = Q 



'n ^ 

Taking n — )■ +00 in ( |5.13[ ), it can be seen that rx{5) — )■ — c. And therefore, ^h{X, r'-^{5),n) 



^ (^)' ^^^'^^ further implies that 



- _ (^inv) ( lne„ ^ ln^H(X,r^((5),n) 



n n 



In en In n 



(mi,) / men inn \ , ^, , 

On the other hand. 



^J^^^J^^^±^J;^M^y^®:^_,^(,y (5.15) 

n n n 

and by the same argument, rx{S) — )■ — c as n — )■ +00. Consequently, ^^(X, r^(5), n) = 
® ( 7n ) ' which further implies 



n ~ 2n 

and 

S = r 



-rx®-^ + 0(n-i) (5.16) 



^inv) ( lne„ lnn\ 
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Combining (jSTT) with ( [STT?] ), ( [5TT61 ) and ( [5717] ) yields, 



r 



(inv) 
X 



In e„ In n 



n 2n 



r 



(inv) 
X 



2n 

In en \nn\ lne„ _^ 

H Oin ) 

n 2n / n 



(5.18) 



This completes the proof of ( |5.4[ ). 
(b) First of all, let us consider the case when a E (O, |). Towards proving ( |5.5| ), let us show 
that 5 = \/2aH{X)n^^ + rjn'^ for some properly chosen constant rj will guarantee 



eniS) < n~^e-^ 

By Theorem [2] and Remark [T] 

e„ (5) <eH(X,r:^ (5),n) e^^-i') 

while 

= e 



^/nr'x (V2aH{X)n ^ + rjn a 



for some constant r]i > 0, and 

g-nrx(5) 

= exp < —n 



l + a 

rjn 2 



exp < —n 



exp < — n — 



-o(l) 



2aj,{X) 
V2v 



+ In 2 



MX) 
V2r] 



(5.19) 
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since a E (O, ^) . Now it is trivial to see that we can select a constant rj such that 



y^^JT (--.N 



l + g 
2 



l + g 

n 2 



which will make ( |5.19| ) satisfied, and consequently 

6 = V2aH{X)n~^ +rin 

> Rn{tn)-H{X). 

In the similar manner, we can show that by making 5_ = \/2aH{X)n~^ — -q'n'^ for 
another constant 7]' > 0, 

en(5) > en- 

Consequently, 



Inn 

~2n 



0{n-^) 



for a G (O, I). The proof of ( |5.6[ ) for the case a G [|, l) is essentially the same, and 
therefore omitted. 

(c) Following the same spirit of the proof for part (b), one can verify that constants $\eta$ and $\eta'$ can be chosen respectively such that



and 



en {5) 



-"V n ' V n In ' 



V n In ' 



< 



n 



n 



> 



n 



n 



which, together with ( |5.1| ), proves ( |5.7| ). 
(d) It can be readily seen that by Theorem (b), 5 = ""^^ Q^^ 
choice to guarantee 

en{5) < e 

while 5 = ^Q-i ( e + ^^^fS ) will make 

en{6) > e 
satisfied. (5.8) then follows immediately from (5.1) and the choices of $\bar{\delta}$ and $\underline{\delta}$.

This completes the proof of Theorem 10.



Remark 3. To show Theorem 10 provides a non-trivial bound, we claim that

$$\delta > r_X(\delta)$$

for $0 < \delta < \ln |\mathcal{X}| - H(X)$. Indeed, recall the definition of $\delta(\lambda)$ and

$$0 \leq r_X(\delta(1)) = H(X) + \delta(1) - \ln |\mathcal{X}|$$

which implies that $\delta(1) \geq \ln |\mathcal{X}| - H(X)$ or $r'_X(\delta) < 1$ for $0 < \delta < \ln |\mathcal{X}| - H(X)$. The claim then follows immediately from the fact that $r_X(0) = 0$.
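The claim of Remark 3 can be checked numerically on a small example. The Python sketch below assumes, by analogy with (4.22) to (4.24), that $r_X(\delta) = \sup_{\lambda \geq 0} [\lambda (H(X) + \delta) - \ln \sum_x p^{1-\lambda}(x)]$ and that $\delta(\lambda)$ is the mean of $-\ln p(X)$ under the tilted pmf proportional to $p^{1-\lambda}(x)$, minus $H(X)$; these forms are consistent with the identity $r_X(\delta(1)) = H(X) + \delta(1) - \ln |\mathcal{X}|$ used in the remark. The function names and the ternary search over a capped interval are illustrative choices.

```python
import math


def right_nep_exponent(delta, pmf, lam_max=50.0, iters=200):
    """Assumed form: sup over lam >= 0 of lam*(H + delta) - ln sum p^(1 - lam)."""
    entropy = -sum(p * math.log(p) for p in pmf)

    def objective(lam):
        return lam * (entropy + delta) - math.log(sum(p ** (1.0 - lam) for p in pmf))

    # The objective is concave in lam, so a ternary search locates its maximum.
    lo, hi = 0.0, lam_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return max(0.0, objective(0.5 * (lo + hi)))


def delta_of_lambda(lam, pmf):
    """Assumed form of delta(lam): tilted mean of -ln p(X) minus H(X)."""
    entropy = -sum(p * math.log(p) for p in pmf)
    weights = [p ** (1.0 - lam) for p in pmf]
    total = sum(weights)
    return sum(w / total * -math.log(p) for w, p in zip(weights, pmf)) - entropy
```

For a pmf with strictly positive entries, `right_nep_exponent(d, pmf)` should stay strictly below `d` whenever `d` lies between 0 and `math.log(len(pmf))` minus the entropy.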

Remark 4. In Part (d) of Theorem 10, we can see that if $\epsilon_n = \epsilon > 0.5$ is selected, then $R_n(\epsilon_n)$ could be strictly less than $H(X)$ for finite block length $n$! This means that if the error probability is allowed to be slightly larger than 0.5, the rate of source code can be even less than the entropy rate. For an IID binary source with $p = \Pr\{X_i = 1\} = 0.12$, Figure 3 shows the tradeoff between the error probability and block length when the code rate is 0.21% below the entropy rate, where in Figure 3, both the entropy rate and code rate are expressed in terms of bits. As can be seen from Figure 3, at the block length 1000, the error probability is around 0.65, and the code rate is 0.21% below the entropy rate. A similar phenomenon can be seen for channel coding, as shown in [3].
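A reader who wishes to explore this tradeoff numerically can build the optimal code directly from the ordering rule stated after Definition 1. The sketch below does so for an IID binary source, assuming $0 < p \leq 0.5$ so that sequences with fewer ones are more probable; the function names are hypothetical, and the numbers it produces depend on how $|S_n|$ is rounded for a given rate, so they need not coincide with those read off Figure 3.

```python
import math


def optimal_error_probability(n, p, size):
    """Pr{X^n not in S_n} for the best S_n with |S_n| = size.

    Assumes an IID binary source with Pr{X_i = 1} = p and 0 < p <= 0.5, so
    sequences are added in order of increasing number of ones.
    """
    if not 0.0 < p <= 0.5:
        raise ValueError("this sketch assumes 0 < p <= 0.5")
    covered = 0.0
    remaining = min(size, 2 ** n)
    for ones in range(n + 1):
        if remaining <= 0:
            break
        # Take as many sequences with this many ones as the budget allows.
        take = min(math.comb(n, ones), remaining)
        log_mass = math.log(take) + ones * math.log(p) + (n - ones) * math.log(1.0 - p)
        covered += math.exp(log_mass)
        remaining -= take
    return max(0.0, 1.0 - covered)


def rate_in_nats(n, size):
    """Rate (1/n) ln |S_n| of a code with |S_n| = size."""
    return math.log(size) / n
```

Sweeping `n` with `size` chosen as the integer part of `2 ** (n * rate_in_bits)` traces an error probability versus block length curve for a fixed rate.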



Remark 5. Related to Part (d) of Theorem 10 is the second order source coding analysis in [5] with a fixed error probability $0 < \epsilon < 1$. Both results are concerned with the scenario where the rate is around the entropy rate in the order of $\frac{1}{\sqrt{n}}$ and the error probability is a constant. However, the work in [5] is asymptotic. On the other hand, Theorem 10 ((5.1) and Part (d)) is non-asymptotic and valid for any block length $n$. It reveals a complete picture about the tradeoff between the rate and error probability when the error probability is constant, or approaches 0 with block length $n$ at an exponential (Part (a)), a sub-exponential (Part (b)), a polynomial (Part (c) with $\alpha \geq 1$), or a sub-polynomial (Part (c) with $0 < \alpha < 1$) speed.

[Figure 3: plot titled "error probability vs. block length when rate is below entropy"; p = 0.12, Entropy = 0.529, Rate = 0.528; horizontal axis: block length]

Fig. 3. Tradeoff between the error probability and block length when the rate is below the entropy rate with $p = 0.12$